lift_utils.lexicon

Manipulate lexicon entries and their dependent elements.

Module Contents

Classes

Entry

This is the core of a lexicon. A lexicon is made up of a set of entries.

Etymology

For describing lexical relations with a word not in the lexicon.

Example

Gives an example sentence or phrase.

GrammaticalInfo

A reference to a range-element in the grammatical-info range.

Lexicon

This is the main class of the lexicon.

Note

For storing descriptive information of many kinds.

Phonetic

This represents a single pronunciation in phonetic form.

Relation

This element is used for lexical relations.

Reversal

Enables a wider use of a dictionary.

Sense

An entry is made up of a number of sense elements.

Translation

A Multitext with an optional translation type attribute.

Variant

Variant elements are used for all sorts of variation.

class lift_utils.lexicon.Entry(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Extensible

This is the core of a lexicon. A lexicon is made up of a set of entries.

Variables:
  • id (Optional[RefId]) – This gives a unique identifier to this entry.

  • guid (Optional[str]) – Deprecated. This gives a unique identifier to this entry in the form of a “universally unique identifier” (RFC 4122).

  • order (Optional[int]) – This is the homograph number.

  • date_deleted (Optional[DateTime]) – If this attribute exists then it indicates that the particular entry has been deleted.

  • lexical_unit (Optional[Multitext]) – The primary lexical form of the entry, as found in the source data models for this standard.

  • citation (Optional[Multitext]) – This is the form that is to be printed in the dictionary.

  • pronunciation_items (Optional[List[Phonetic]]) – There can be multiple phonetic forms of an entry.

  • variant_items (Optional[List[Variant]]) – Any constrained variants or free orthographic variants.

  • sense_items (Optional[List[Sense]]) – This is where the definition goes.

  • note_items (Optional[List[Note]]) – The more notes you keep the better.

  • relation_items (Optional[List[Relation]]) – Gives a lexical relationship between this entry and another entry or sense.

  • etymology_items (Optional[List[Etymology]]) – Differs from a lexical relation in that it has no referent in the lexicon.

XML_TAG = 'entry'

add_etymology() → Etymology

Add an empty Etymology item to the entry. Returns the new object, which can then be used to add data to it.

add_note() → Note

Add an empty Note item to the entry. Returns the new object, which can then be used to add data to it.

add_pronunciation() → Phonetic

Add an empty Phonetic item to the entry. Returns the new object, which can then be used to add data to it.

add_relation() → Relation

Add an empty Relation item to the entry. Returns the new object, which can then be used to add data to it.

add_sense() → Sense

Add an empty Sense item to the entry. Returns the new object, which can then be used to add data to it.

add_variant() → Variant

Add an empty Variant item to the entry. Returns the new object, which can then be used to add data to it.

get_id() → lift_utils.datatypes.RefId

Return the object’s unique identifier.

set_citation(forms_dict=None)

Set the entry’s Citation.

Variables:

forms_dict (Optional[dict]) – dict keys are language codes, values are the text descriptions of the Citation.

set_lexical_unit(forms_dict=None)

Set the entry’s LexicalUnit.

Variables:

forms_dict (Optional[dict]) – dict keys are language codes, values are text descriptions of the LexicalUnit.
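
As an illustration of the forms_dict shape, the sketch below builds the kind of LIFT XML such a dict corresponds to, using only the standard library. It does not call lift_utils: the helper name forms_dict_to_element is hypothetical, the language codes are sample values, and the library's own serialization may differ in detail.

```python
import xml.etree.ElementTree as ET


def forms_dict_to_element(tag, forms_dict):
    """Build a LIFT multitext element (hypothetical helper, not part of lift_utils)."""
    parent = ET.Element(tag)
    for lang, text in forms_dict.items():
        # Each dict item becomes one <form lang="..."><text>...</text></form>.
        form = ET.SubElement(parent, "form", {"lang": lang})
        ET.SubElement(form, "text").text = text
    return parent


# Keys are language (writing system) codes, values are the text in each one.
forms = {"fr": "maison", "fr-fonipa": "mɛzɔ̃"}
lexical_unit = forms_dict_to_element("lexical-unit", forms)
print(ET.tostring(lexical_unit, encoding="unicode"))
```

The same dict shape is what set_citation() and set_definition() are documented to accept.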

class lift_utils.lexicon.Etymology(etym_type: lift_utils.datatypes.Key = None, source: str = None, xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Extensible

For describing lexical relations with a word not in the lexicon.

Variables:
  • type (Key) – Gives the etymological relationship between this sense and some other word in another language.

  • source (str) – Gives the source language of the etymological relation.

  • gloss_items (Optional[List[Gloss]]) – Gives glosses of the word that the etymological relationship is with.

  • form (Optional[Form]) – Holds the form of the etymological reference.

XML_TAG = 'etymology'

add_gloss(lang, text)

set_form()

class lift_utils.lexicon.Example(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext, lift_utils.base.Extensible

Gives an example sentence or phrase. The example is given in its original language, together with glosses of that example in other languages.

Variables:
  • source (Optional[Key]) – A reference by which another application may refer to this example, or a reference into another database of texts.

  • translation_items (Optional[List[Translation]]) – Gives translations of the example into different languages.

  • note_items (Optional[List[Note]]) – Holds notes on this example.

XML_TAG = 'example'

class lift_utils.lexicon.GrammaticalInfo(value: lift_utils.datatypes.Key = None, xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.LIFTUtilsBase

A reference to a range-element in the grammatical-info range.

Variables:
  • value (Key) – The part of speech tag in the grammatical-info range.

  • trait_items (Optional[List[Trait]]) – Allows the grammatical information for a given sense to have more information than just the part of speech given by the value attribute.

XML_TAG = 'grammatical-info'

class lift_utils.lexicon.Lexicon(path: pathlib.Path | str | None = None, version: str = None, xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.LIFTUtilsBase

This is the main class of the lexicon. It contains the header and all the entries in the database.

Variables:
  • version (str) – Specifies the LIFT language version number.

  • producer (Optional[str]) – Identifies the particular producer of this LIFT file.

  • header (Optional[Header]) – Contains the header information for the database.

  • entry_items (Optional[List[Entry]]) – Each of the entries in the lexicon.

  • path (Optional[Path]) – File path to a LIFT file to import.

XML_TAG = 'lift'

add_entry() → Entry

Add an empty Entry to the lexicon. Returns the Entry object, which can then be used to add data to it.

find(text: str, field: str = 'gloss', match_type: str = 'contains') → Entry | Sense | None

Return the first matching Entry or Sense item. The field searched can be the entry’s “lexical-unit” or “variant” field, the sense’s “gloss” [default], “definition”, or “grammatical-info” field, as well as any fields defined in the LIFT file’s header.

Variables:
  • text (str) – The search term.

  • field (str) – The field to be searched [default is “gloss”].

  • match_type (str) – The kind of comparison made between the search term and the field’s data. Possible values are “contains” [default], “exact”, or “regex”.
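
To make the three match_type values concrete, here is a minimal standalone sketch of how such comparisons are commonly implemented. It is not the library's code: the function name matches is hypothetical, and the case sensitivity and the use of re.search (rather than re.match or re.fullmatch) are assumptions.

```python
import re


def matches(term, value, match_type="contains"):
    """Compare a search term with a field value (illustrative only)."""
    if match_type == "contains":
        return term in value
    if match_type == "exact":
        return term == value
    if match_type == "regex":
        # Assumption: the pattern may match anywhere in the value.
        return re.search(term, value) is not None
    raise ValueError(f"unknown match_type: {match_type!r}")


glosses = ["house", "houseboat", "greenhouse"]
contains_hits = [g for g in glosses if matches("house", g)]
exact_hits = [g for g in glosses if matches("house", g, "exact")]
regex_hits = [g for g in glosses if matches(r"^house", g, "regex")]
```

Under these assumptions, “contains” selects all three glosses, “exact” selects only "house", and the anchored pattern selects "house" and "houseboat".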

find_all(text: str = '', field: str = 'gloss', match_type: str = 'contains') → List[Entry | Sense]

Return all matching Entry or Sense items. The field searched can be the entry’s “lexical-unit” or “variant” field, the sense’s “gloss” [default], “definition”, or “grammatical-info” field, as well as any fields defined in the LIFT file’s header.

Variables:
  • text (str) – The search term.

  • field (str) – The field to be searched [default is “gloss”].

  • match_type (str) – The kind of comparison made between the search term and the field’s data. Possible values are “contains” [default], “exact”, or “regex”.

get_item_by_id(refid: str) → Entry | Sense | None

Return an entry or sense by its id attribute.

Variables:

refid (str) – The id attribute of the entry or sense.

get_range_elements(range_name)

Returns a generator object that lists all the names defined in the header for the given range.

Variables:

range_name (str) – The name of the header range.

get_ranges()

Returns a generator object that lists all the range names defined in the header.

show()

Print an overview of the Lexicon in the terminal window.

to_lift(file_path: str)

Save the Lexicon as a LIFT file. The LIFT-RANGES file will be automatically created in the same folder as the LIFT file.

Variables:

file_path (str) – Full or relative path to new LIFT file.

class lift_utils.lexicon.Note(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext, lift_utils.base.Extensible

For storing descriptive information of many kinds. It can include comments, bibliographic information and domain specific notes.

Variables:

type (Optional[Key]) – Gives the type of note by reference to a range-element in the note-type range.

XML_TAG = 'note'

class lift_utils.lexicon.Phonetic(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext, lift_utils.base.Extensible

This represents a single pronunciation in phonetic form.

Variables:
  • media_items (Optional[List[URLRef]]) – Stores an audio representation of the text.

  • form_items (Optional[List[Span]]) – Used by LIFT v0.13 (FieldWorks). Stores the phonetic representation using whichever writing system: IPA, Americanist, etc.

XML_TAG = 'pronunciation'

add_form(lang, text)

add_media(href=None, label=None)

class lift_utils.lexicon.Relation(rel_type: lift_utils.datatypes.Key = None, ref: lift_utils.datatypes.RefId = None, xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Extensible

This element is used for lexical relations.

Variables:
  • type (Key) – Is the type of the particular lexical relation.

  • ref (RefId) – This is the other end of the relation, either a sense or an entry.

  • order (Optional[int]) – Gives the relative ordering of relations of a given type when a multiple relation is being described.

  • usage (Optional[Multitext]) – Gives information on usage in one or more languages or writing systems.

XML_TAG = 'relation'

class lift_utils.lexicon.Reversal(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext

Enables a wider use of a dictionary.

Variables:
  • type (Optional[Key]) – Gives the type of the reversal as a range-element in the reversal-type range.

  • main (Optional[Reversal]) – This gives the parent reversal in a hierarchical set of reversals.

  • grammatical_info (Optional[GrammaticalInfo]) – This allows a reversal relation to specify what the grammatical information is in the reversal language.

XML_TAG = 'reversal'

class lift_utils.lexicon.Sense(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Extensible

An entry is made up of a number of sense elements.

Variables:
  • id (Optional[RefId]) – This gives an identifier for this Sense so that things can refer to it.

  • order (Optional[int]) – A number that is used to give the relative order of senses within an entry.

  • grammatical_info (Optional[GrammaticalInfo]) – Grammatical information.

  • gloss_items (Optional[List[Union[Gloss, Form]]]) – Each gloss is a single string in a single language and writing system. Form is used by LIFT v0.13 (FieldWorks), while Gloss is used in later versions.

  • definition (Optional[Multitext]) – Gives the definition in multiple languages or writing systems.

  • relation_items (Optional[List[Relation]]) – While a lexical relation isn’t strictly owned by a sense it is a good place to hold it.

  • note_items (Optional[List[Note]]) – There are lots of different types of notes.

  • example_items (Optional[List[Example]]) – Examples may be used for different target audiences.

  • reversal_items (Optional[List[Reversal]]) – There may be different reversal indexes.

  • illustration_items (Optional[List[URLRef]]) – The picture doesn’t have to be static.

  • subsense_items (Optional[List[Sense]]) – Senses can form a hierarchy.
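
Because subsense_items holds further senses, a sense and its subsenses form a tree. The sketch below walks such a tree depth-first. SenseNode is a hypothetical stand-in with only the two fields needed here, not the lift_utils Sense class, and the glosses are sample data.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class SenseNode:
    """Stand-in for a sense; not the lift_utils Sense class."""
    gloss: str
    subsense_items: List["SenseNode"] = field(default_factory=list)


def walk(sense: SenseNode, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, gloss) for a sense and all nested subsenses, depth-first."""
    yield depth, sense.gloss
    for sub in sense.subsense_items:
        yield from walk(sub, depth + 1)


top = SenseNode(
    "run",
    [
        SenseNode("run (move fast)"),
        SenseNode("run (operate)", [SenseNode("run a program")]),
    ],
)
outline = list(walk(top))
for depth, gloss in outline:
    print("  " * depth + gloss)
```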

XML_TAG = 'sense'

add_example() → Example

Add an empty Example item to the sense. Returns the new object, which can then be used to add data to it.

add_gloss(lang, text) → lift_utils.base.Gloss

Add a Gloss item to the sense. Returns the new object.

Variables:
  • lang (str) – The gloss’s language.

  • text (str) – The actual gloss text.

add_illustration(href=None) → lift_utils.base.URLRef

Add a URLRef illustration item to the sense. Returns the new object, which can then be used to add data to it.

add_note() → Note

Add an empty Note item to the sense. Returns the new object, which can then be used to add data to it.

add_relation() → Relation

Add an empty Relation item to the sense. Returns the new object, which can then be used to add data to it.

add_reversal() → Reversal

Add an empty Reversal item to the sense. Returns the new object, which can then be used to add data to it.

add_subsense()

Add an empty subsense (a nested Sense item) to the sense. Returns the new object, which can then be used to add data to it.

get_gloss(lang='en') → str

Get the gloss for a given language. Defaults to English; falls back otherwise to the first gloss.
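
The fallback rule described above can be sketched as follows. This is a standalone illustration rather than the library's implementation: pick_gloss is a hypothetical name, glosses are modeled as (language, text) pairs, and returning None for an empty list is an assumption.

```python
from typing import List, Optional, Tuple


def pick_gloss(glosses: List[Tuple[str, str]], lang: str = "en") -> Optional[str]:
    """Return the gloss text for lang, else the first gloss, else None."""
    for gloss_lang, text in glosses:
        if gloss_lang == lang:
            return text
    # Fall back to the first gloss when the requested language is absent.
    return glosses[0][1] if glosses else None


glosses = [("fr", "maison"), ("en", "house")]
english = pick_gloss(glosses)         # the requested language is present
fallback = pick_gloss(glosses, "de")  # no German gloss, so the first one is used
```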

get_grammatical_info() → GrammaticalInfo

Return the grammatical-info part of speech value.

get_id() → lift_utils.datatypes.RefId

Return the object’s id attribute.

set_definition(forms_dict=None)

Set the sense’s definition.

Variables:

forms_dict (Optional[dict]) – dict keys are language codes, values are the text for each definition.

set_grammatical_info(value: str)

Set the sense’s GrammaticalInfo.

Variables:

value (str) – The part of speech tag in the grammatical-info range.

class lift_utils.lexicon.Translation(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext

A Multitext with an optional translation type attribute.

Variables:

type (Optional[Key]) – Gives the type of the translation.

XML_TAG = 'translation'

class lift_utils.lexicon.Variant(xml_tree: lxml.etree._Element | None = None, **kwargs)

Bases: lift_utils.base.Multitext, lift_utils.base.Extensible

Variant elements are used for all sorts of variation.

Variables:
  • ref (Optional[RefId]) – Gives the variation as a reference to another entry or sense rather than specifying the form (that is, the Multitext value of the variant).

  • pronunciation_items (Optional[List[Phonetic]]) – Holds the phonetic variant, whether the variation is in phonetics only or arises from an orthographic or phonemic variation.

  • relation_items (Optional[List[Relation]]) – Some variants have a lexical relationship with other senses or entries in the lexicon.

XML_TAG = 'variant'