lift_utils.lexicon¶
Manipulate lexicon entries and their dependent elements.
Module Contents¶
Classes¶
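Entry – This is the core of a lexicon. A lexicon is made up of a set of entries.
Etymology – For describing lexical relations with a word not in the lexicon.
Example – Gives an example sentence or phrase.
GrammaticalInfo – A reference to a range-element in the grammatical-info range.
Lexicon – This is the main class of the lexicon.
Note – For storing descriptive information of many kinds.
Phonetic – This represents a single pronunciation in phonetic form.
Relation – This element is used for lexical relations.
Reversal – Enables a wider use of a dictionary.
Sense – An entry is made up of a number of sense elements.
Translation – A Multitext with an optional translation type attribute.
Variant – Variant elements are used for all sorts of variation.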
- class lift_utils.lexicon.Entry(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Extensible
This is the core of a lexicon. A lexicon is made up of a set of entries.
- Variables:
id (Optional[RefId]) – This gives a unique identifier to this entry.
guid (Optional[str]) – Deprecated. This gives a unique identifier to this entry in the form of a “universally unique identifier” (RFC 4122).
order (Optional[int]) – This is the homograph number.
date_deleted (Optional[DateTime]) – If this attribute exists then it indicates that the particular entry has been deleted.
lexical_unit (Optional[Multitext]) – The primary lexical form, as found in the source data models for this standard.
citation (Optional[Multitext]) – This is the form that is to be printed in the dictionary.
pronunciation_items (Optional[List[Phonetic]]) – There can be multiple phonetic forms of an entry.
variant_items (Optional[List[Variant]]) – Any constrained variants or free orthographic variants.
sense_items (Optional[List[Sense]]) – This is where the definition goes.
note_items (Optional[List[Note]]) – The more notes you keep the better.
relation_items (Optional[List[Relation]]) – Gives a lexical relationship between this entry and another entry or sense.
etymology_items (Optional[List[Etymology]]) – Differs from a lexical relation in that it has no referent in the lexicon.
- XML_TAG = 'entry'¶
- add_etymology() Etymology¶
Add an empty Etymology item to the entry. Returns the new object, which can then be used to add data to it.
- add_note() Note¶
Add an empty Note item to the entry. Returns the new object, which can then be used to add data to it.
- add_pronunciation() Phonetic¶
Add an empty Phonetic item to the entry. Returns the new object, which can then be used to add data to it.
- add_relation() Relation¶
Add an empty Relation item to the entry. Returns the new object, which can then be used to add data to it.
- add_sense() Sense¶
Add an empty Sense item to the entry. Returns the new object, which can then be used to add data to it.
- add_variant() Variant¶
Add an empty Variant item to the entry. Returns the new object, which can then be used to add data to it.
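As a rough usage sketch, the add_* methods can be combined with the setters documented on this page. The helper name add_simple_entry, its parameters, and the sample values are hypothetical and not part of lift_utils. The lexicon argument is assumed to behave like a lift_utils.lexicon.Lexicon instance, and only methods listed on this page are called.

```python
def add_simple_entry(lexicon, lang, headword, gloss_lang, gloss, pos=None):
    """Add one entry with a single sense to a Lexicon-like object.

    Illustrative helper only; `lexicon` is assumed to expose add_entry()
    as documented for lift_utils.lexicon.Lexicon.
    """
    entry = lexicon.add_entry()               # new, empty Entry
    entry.set_lexical_unit({lang: headword})  # forms_dict: language code -> text
    sense = entry.add_sense()                 # new, empty Sense
    sense.add_gloss(gloss_lang, gloss)
    if pos is not None:
        # value is expected to name an element of the grammatical-info range
        sense.set_grammatical_info(pos)
    return entry
```

With lift-utils installed, this might be called as add_simple_entry(Lexicon(), "fr", "chat", "en", "cat", pos="Noun"), assuming Lexicon() can be created without arguments as its signature suggests.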
- get_id() lift_utils.datatypes.RefId¶
Return the object’s unique identifier.
- set_citation(forms_dict=None)¶
Set the entry’s Citation.
- Variables:
forms_dict (Optional[dict]) – dict keys are language codes, values are the text descriptions of the Citation.
- set_lexical_unit(forms_dict=None)¶
Set the entry’s LexicalUnit.
- Variables:
forms_dict (Optional[dict]) – dict keys are language codes, values are text descriptions of the LexicalUnit.
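To make the forms_dict shape concrete, the sketch below builds the kind of LIFT XML that a language-code-to-text mapping corresponds to (one form element per language, each wrapping a text element). It uses only the standard library. The function name multitext_element and the sample words are hypothetical, and this is not how lift_utils necessarily serializes its objects internally.

```python
import xml.etree.ElementTree as ET

def multitext_element(tag, forms_dict):
    """Build a LIFT-style multitext element from {language code: text}."""
    parent = ET.Element(tag)
    for lang, text in forms_dict.items():
        form = ET.SubElement(parent, "form", {"lang": lang})
        ET.SubElement(form, "text").text = text
    return parent

# Same dict shape as the forms_dict argument described above.
lexical_unit = multitext_element("lexical-unit", {"fr": "chat", "en": "cat"})
print(ET.tostring(lexical_unit, encoding="unicode"))
```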
- class lift_utils.lexicon.Etymology(etym_type: lift_utils.datatypes.Key = None, source: str = None, xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Extensible
For describing lexical relations with a word not in the lexicon.
- Variables:
type (Key) – Gives the etymological relationship between this sense and some other word in another language.
source (str) – Gives the source language of the etymological relation.
gloss_items (Optional[List[Gloss]]) – Gives glosses of the word that the etymological relationship is with.
form (Optional[Form]) – Holds the form of the etymological reference.
- XML_TAG = 'etymology'¶
- add_gloss(lang, text)¶
- set_form()¶
- class lift_utils.lexicon.Example(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext, lift_utils.base.Extensible
Gives an example sentence or phrase. The example is given in the language being documented, together with glosses of it in other languages.
- Variables:
source (Optional[Key]) – A reference by which another application may refer to this example, or a reference into another database of texts.
translation_items (Optional[List[Translation]]) – Gives translations of the example into different languages.
note_items (Optional[List[Note]]) – Holds notes on this example.
- XML_TAG = 'example'¶
- class lift_utils.lexicon.GrammaticalInfo(value: lift_utils.datatypes.Key = None, xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.LIFTUtilsBase
A reference to a range-element in the grammatical-info range.
- Variables:
- XML_TAG = 'grammatical-info'¶
- class lift_utils.lexicon.Lexicon(path: pathlib.Path | str | None = None, version: str = None, xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.LIFTUtilsBase
This is the main class of the lexicon. It contains the header and all the entries in the database.
- Variables:
version (str) – Specifies the LIFT language version number.
producer (Optional[str]) – Identifies the particular producer of this LIFT file.
header (Optional[Header]) – Contains the header information for the database.
entry_items (Optional[List[Entry]]) – Each of the entries in the lexicon.
path (Optional[Path]) – File path to a LIFT file to import.
- XML_TAG = 'lift'¶
- add_entry() Entry¶
Add an empty Entry to the lexicon. Returns the Entry object, which can then be used to add data to it.
- find(text: str, field: str = 'gloss', match_type: str = 'contains') Entry | Sense | None¶
Return the first matching Entry or Sense item. The field searched can be the entry’s “lexical-unit” or “variant” field, the sense’s “gloss” [default], “definition”, or “grammatical-info” field, as well as any fields defined in the LIFT file’s header.
- Variables:
text (str) – The search term.
field (str) – The field to be searched [default is “gloss”].
match_type (str) – The kind of comparison between the search term and the field’s data. Possible values are “contains” [default], “exact”, or “regex”.
- find_all(text: str = '', field: str = 'gloss', match_type: str = 'contains') List[Entry | Sense]¶
Return all matching Entry or Sense items. The field searched can be the entry’s “lexical-unit” or “variant” field, the sense’s “gloss” [default], “definition”, or “grammatical-info” field, as well as any fields defined in the LIFT file’s header.
- Variables:
text (str) – The search term.
field (str) – The field to be searched [default is “gloss”].
match_type (str) – The kind of comparison between the search term and the field’s data. Possible values are “contains” [default], “exact”, or “regex”.
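The three match_type values used by find and find_all can be pictured with a small standalone comparison function. This is a sketch of the semantics the names suggest, not the library's implementation. The function name field_matches is hypothetical, and case sensitivity, the use of re.search rather than a full match, and the error raised for an unknown match_type are all assumptions.

```python
import re

def field_matches(search_text, field_value, match_type="contains"):
    """Compare a search term with one field value (illustrative only)."""
    if match_type == "contains":
        return search_text in field_value
    if match_type == "exact":
        return search_text == field_value
    if match_type == "regex":
        return re.search(search_text, field_value) is not None
    raise ValueError(f"unknown match_type: {match_type!r}")

glosses = ["cat", "wildcat", "dog"]
print([g for g in glosses if field_matches("cat", g)])              # ['cat', 'wildcat']
print([g for g in glosses if field_matches("cat", g, "exact")])     # ['cat']
print([g for g in glosses if field_matches(r"^wild", g, "regex")])  # ['wildcat']
```

Under this reading, find would correspond to taking the first item for which the comparison succeeds, and find_all to collecting every such item.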
- get_item_by_id(refid: str) Entry | Sense | None¶
Return an entry or sense by its id attribute.
- Variables:
refid (str) – The id attribute of the entry or sense.
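The lookup can be illustrated directly on LIFT XML with the standard library. The sample document, its ids, and the function name find_by_id are made up for this sketch. Per the signature above, lift_utils returns Entry or Sense objects rather than raw XML elements, so this only mirrors the idea of matching on the id attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal LIFT document with one entry and one sense.
LIFT_SAMPLE = """
<lift version="0.13">
  <entry id="chat_01">
    <lexical-unit><form lang="fr"><text>chat</text></form></lexical-unit>
    <sense id="chat_01_s1"><gloss lang="en"><text>cat</text></gloss></sense>
  </entry>
</lift>
""".strip()

def find_by_id(root, refid):
    """Return the first entry or sense element whose id equals refid, else None."""
    for element in root.iter():
        if element.tag in ("entry", "sense") and element.get("id") == refid:
            return element
    return None

root = ET.fromstring(LIFT_SAMPLE)
print(find_by_id(root, "chat_01").tag)     # entry
print(find_by_id(root, "chat_01_s1").tag)  # sense
print(find_by_id(root, "missing"))         # None
```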
- get_range_elements(range_name)¶
Returns a generator object that lists all the names defined in the header for the given range.
- Variables:
range_name (str) – The name of the header range.
- get_ranges()¶
Returns a generator object that lists all the range names defined in the header.
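Because both range methods return generators, their results are consumed by iterating and can only be traversed once. The sketch below shows that behaviour with stand-in functions over a plain dict. The names iter_range_names and iter_range_elements and the sample range contents are hypothetical, and yielding nothing for an unknown range is an assumption rather than documented lift_utils behaviour.

```python
# Stand-in data: range name -> element names (values are made up).
SAMPLE_RANGES = {
    "grammatical-info": ["Noun", "Verb"],
    "note-type": ["general"],
}

def iter_range_names(ranges):
    """Yield each range name, similar in spirit to get_ranges()."""
    yield from ranges

def iter_range_elements(ranges, range_name):
    """Yield the element names of one range, similar in spirit to get_range_elements()."""
    yield from ranges.get(range_name, [])

names = iter_range_names(SAMPLE_RANGES)
print(list(names))  # ['grammatical-info', 'note-type']
print(list(names))  # [] because a generator is exhausted after one pass
print(list(iter_range_elements(SAMPLE_RANGES, "grammatical-info")))  # ['Noun', 'Verb']
```

Wrapping the result in list() is a simple way to keep the names around for repeated use.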
- show()¶
Print an overview of the Lexicon in the terminal window.
- to_lift(file_path: str)¶
Save the Lexicon as a LIFT file. The LIFT-RANGES file will be automatically created in the same folder as the LIFT file.
- Variables:
file_path (str) – Full or relative path to new LIFT file.
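The relationship between the two output files can be sketched with pathlib. The reference above only states that the LIFT-RANGES file lands in the same folder. The ".lift-ranges" file name used here follows a common export convention and is an assumption, as are the function name expected_output_paths and the sample path.

```python
from pathlib import Path

def expected_output_paths(file_path):
    """Return (lift_path, ranges_path) for a planned export (illustrative only)."""
    lift_path = Path(file_path)
    # Assumed naming: same folder and stem, with a .lift-ranges suffix.
    ranges_path = lift_path.with_suffix(".lift-ranges")
    return lift_path, ranges_path

lift_path, ranges_path = expected_output_paths("export/french.lift")
print(ranges_path.name)                        # french.lift-ranges
print(ranges_path.parent == lift_path.parent)  # True
```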
- class lift_utils.lexicon.Note(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext, lift_utils.base.Extensible
For storing descriptive information of many kinds. It can include comments, bibliographic information and domain specific notes.
- Variables:
type (Optional[Key]) – Gives the type of note by reference to a range-element in the note-type range.
- XML_TAG = 'note'¶
- class lift_utils.lexicon.Phonetic(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext, lift_utils.base.Extensible
This represents a single pronunciation in phonetic form.
- Variables:
- XML_TAG = 'pronunciation'¶
- add_form(lang, text)¶
- add_media(href=None, label=None)¶
- class lift_utils.lexicon.Relation(rel_type: lift_utils.datatypes.Key = None, ref: lift_utils.datatypes.RefId = None, xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Extensible
This element is used for lexical relations.
- Variables:
type (Key) – Is the type of the particular lexical relation.
ref (RefId) – This is the other end of the relation, either a sense or an entry.
order (Optional[int]) – Gives the relative ordering of relations of a given type when a multiple relation is being described.
usage (Optional[Multitext]) – Gives information on usage in one or more languages or writing systems.
- XML_TAG = 'relation'¶
- class lift_utils.lexicon.Reversal(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext
Enables a wider use of a dictionary.
- Variables:
type (Optional[Key]) – Gives the type of the reversal as a range-element in the reversal-type range.
main (Optional[Reversal]) – This gives the parent reversal in a hierarchical set of reversals.
grammatical_info (Optional[GrammaticalInfo]) – This allows a reversal relation to specify what the grammatical information is in the reversal language.
- XML_TAG = 'reversal'¶
- class lift_utils.lexicon.Sense(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Extensible
An entry is made up of a number of sense elements.
- Variables:
id (Optional[RefId]) – This gives an identifier for this Sense so that things can refer to it.
order (Optional[int]) – A number that is used to give the relative order of senses within an entry.
grammatical_info (Optional[GrammaticalInfo]) – Grammatical information.
gloss_items (Optional[List[Union[Gloss, Form]]]) – Each gloss is a single string in a single language and writing system. Form is used by LIFT v0.13 (FieldWorks), while Gloss is used in later versions.
definition (Optional[Multitext]) – Gives the definition in multiple languages or writing systems.
relation_items (Optional[List[Relation]]) – While a lexical relation isn’t strictly owned by a sense it is a good place to hold it.
note_items (Optional[List[Note]]) – There are lots of different types of notes.
example_items (Optional[List[Example]]) – Examples may be used for different target audiences.
reversal_items (Optional[List[Reversal]]) – There may be different reversal indexes.
illustration_items (Optional[List[URLRef]]) – The picture doesn’t have to be static.
subsense_items (Optional[List[Sense]]) – Senses can form a hierarchy.
- XML_TAG = 'sense'¶
- add_example() Example¶
Add an empty Example item to the sense. Returns the new object, which can then be used to add data to it.
- add_gloss(lang, text) lift_utils.base.Gloss¶
Add a Gloss item to the sense. Returns the new object.
- Variables:
lang (str) – The gloss’s language.
text (str) – The actual gloss text.
- add_illustration(href=None) lift_utils.base.URLRef¶
Add a URLRef illustration item to the sense. Returns the new object, which can then be used to add data to it.
- add_note() Note¶
Add an empty Note item to the sense. Returns the new object, which can then be used to add data to it.
- add_relation() Relation¶
Add an empty Relation item to the sense. Returns the new object, which can then be used to add data to it.
- add_reversal() Reversal¶
Add an empty Reversal item to the sense. Returns the new object, which can then be used to add data to it.
- add_subsense()¶
Add an empty subsense (a nested Sense) to the sense. Returns the new object, which can then be used to add data to it.
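Since senses can nest through subsense_items, a depth-first walk is one way to visit a whole sense hierarchy. The sketch below uses stand-in objects that share only the subsense_items and id attribute names with Sense. The function name walk_senses and the sample ids are hypothetical.

```python
from types import SimpleNamespace

def walk_senses(sense, depth=0):
    """Yield (depth, sense) for a sense and all nested subsenses, depth-first."""
    yield depth, sense
    # subsense_items is documented as Optional, so treat None as empty.
    for sub in getattr(sense, "subsense_items", None) or []:
        yield from walk_senses(sub, depth + 1)

# Stand-in objects, not real lift_utils Sense instances.
leaf = SimpleNamespace(id="s1.1.1", subsense_items=None)
child = SimpleNamespace(id="s1.1", subsense_items=[leaf])
top = SimpleNamespace(id="s1", subsense_items=[child])

print([(depth, s.id) for depth, s in walk_senses(top)])
# [(0, 's1'), (1, 's1.1'), (2, 's1.1.1')]
```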
- get_gloss(lang='en') str¶
Get the gloss for a given language. Defaults to English; falls back otherwise to the first gloss.
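The fallback rule described here can be written out as a small standalone function. This is a sketch of the stated behaviour, not the library's code. The name pick_gloss, the list-of-pairs input, and returning None when there are no glosses at all are assumptions.

```python
def pick_gloss(glosses, lang="en"):
    """Pick a gloss text from (language code, text) pairs.

    Prefers the requested language; otherwise falls back to the first gloss.
    """
    for gloss_lang, text in glosses:
        if gloss_lang == lang:
            return text
    return glosses[0][1] if glosses else None

print(pick_gloss([("fr", "chat"), ("en", "cat")]))              # cat
print(pick_gloss([("fr", "chat"), ("es", "gato")]))             # chat
print(pick_gloss([("fr", "chat"), ("es", "gato")], lang="es"))  # gato
```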
- get_grammatical_info() GrammaticalInfo¶
Return the grammatical-info part of speech value.
- get_id() lift_utils.datatypes.RefId¶
Return the object’s id attribute.
- set_definition(forms_dict=None)¶
Set the sense’s definition.
- Variables:
forms_dict (Optional[dict]) –
dictkeys are language codes, values are the text for each definition.
- set_grammatical_info(value: str)¶
Set the sense’s GrammaticalInfo.
- Variables:
value (str) – The part of speech tag in the grammatical-info range.
- class lift_utils.lexicon.Translation(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext
A Multitext with an optional translation type attribute.
- Variables:
type (Optional[Key]) – Gives the type of the translation.
- XML_TAG = 'translation'¶
- class lift_utils.lexicon.Variant(xml_tree: lxml.etree._Element | None = None, **kwargs)¶
Bases: lift_utils.base.Multitext, lift_utils.base.Extensible
Variant elements are used for all sorts of variation.
- Variables:
ref (Optional[RefId]) – Gives the variation as a reference to another entry or sense rather than specifying the form (that is, the Multitext value of the variant).
pronunciation_items (Optional[List[Phonetic]]) – Holds the phonetic variant, whether the variation is in phonetics only or arises from an orthographic or phonemic variation.
relation_items (Optional[List[Relation]]) – Some variants have a lexical relationship with other senses or entries in the lexicon.
- XML_TAG = 'variant'¶