Note that if a tier already exists with the same label as one you have just entered
during the configure interlinear tiers process, a new tier is created with that label
followed by –cp, so the original tier is not lost. If you want to overwrite the existing
tier, delete it beforehand.
Principles of annotation in ELAN
The ELAN lexicon holds three kinds of entries (called Lexemes here):
Lemma: the base form chosen to represent the various forms of a word in context. A lemma
may have alternate (contextual) forms, referred to here as variants.
Stem: a form that cannot appear on its own as a word and needs a complementary affix.
A stem may carry a symbol (e.g. _) on its left, its right or both to distinguish it from
a lemma, if desired.
Affix: any morpheme that can be attached to a lemma, a stem or another affix. By default,
a suffix carries a hyphen (-) on its left and a prefix carries one on its right. Clitics
can be marked with an equal sign (=) on the left or right, and reduplication with a
tilde (~) at the beginning of the segment (cf. parameters).
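As an illustration of these marker conventions, the following Python sketch guesses the kind of an entry from its boundary symbols. It is not part of ELAN: the function name classify and the sample forms are hypothetical, and the symbols are only the defaults described above, which can be changed in the parameters.

```python
# Default boundary markers described above. The real ones are set in the
# tool's parameters, so these values are assumptions.
AFFIX, CLITIC, REDUP, STEM = "-", "=", "~", "_"


def classify(form: str) -> str:
    """Guess the kind of a lexicon entry from its boundary markers."""
    if form.startswith(REDUP):
        return "reduplication"
    if form.startswith(AFFIX):
        return "suffix"  # hyphen on the left, e.g. "-ed"
    if form.endswith(AFFIX):
        return "prefix"  # hyphen on the right, e.g. "un-"
    if form.startswith(CLITIC) or form.endswith(CLITIC):
        return "clitic"
    if form.startswith(STEM) or form.endswith(STEM):
        return "stem"
    return "lemma"  # no marker at all


for sample in ["walk", "_ject", "un-", "-ed", "=s", "~ba"]:
    print(sample, "->", classify(sample))
```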
Looking up words in the lexicon
The first step of ELAN-CorpA annotation is to try to match the current word with the
lemmas or stems of the lexicon, or with their alternate forms (variants). If the word is
found, the values of the entry's Lexeme, Gloss and Tier X fields are copied to the
corresponding mb, ge and rx tiers under the current word in the annotation area. Note that
if the word matches a variant of a lexeme, the mb tier shows the underlying lexeme value,
not the variant.
As a second step, if the word is not found, the parser tries to segment it using the
affixes of the lexicon.
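The lookup step can be pictured with a minimal Python sketch. It is not the actual ELAN-CorpA code: the lexicon contents, the dictionary keys and the functions build_index and lookup are hypothetical and only mirror the Lexeme, Gloss and Tier X fields named above.

```python
from typing import Optional

# Hypothetical lexicon; the keys mirror the Lexeme, Gloss and Tier X fields.
LEXICON = [
    {"lexeme": "go", "gloss": "move", "tier_x": "v", "variants": ["went", "gone"]},
    {"lexeme": "house", "gloss": "dwelling", "tier_x": "n", "variants": []},
]


def build_index(lexicon: list) -> dict:
    """Map every lexeme and every variant to the entry it belongs to."""
    index = {}
    for entry in lexicon:
        for form in [entry["lexeme"], *entry["variants"]]:
            index.setdefault(form, entry)  # the first entry wins on a clash
    return index


def lookup(word: str, index: dict) -> Optional[dict]:
    """Return the values for the mb, ge and rx tiers, or None if not found."""
    entry = index.get(word)
    if entry is None:
        return None  # the word would then go on to segmentation
    # A variant yields the underlying lexeme in mb, not the variant itself.
    return {"mb": entry["lexeme"], "ge": entry["gloss"], "rx": entry["tier_x"]}


index = build_index(LEXICON)
print(lookup("went", index))
print(lookup("houses", index))
```

In this sketch the variant "went" fills the mb tier with "go", while "houses" is not found and would be passed on to the second step.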
Segmentation
When a word is not found in the lexicon, the parser tries to match every affix in the
lexicon (prefixes, suffixes, clitics, reduplications...) against the beginning and/or end
of the word. When an affix matches, the parser isolates it and searches the lexicon for
the rest of the word, and so on recursively. If the rest is not found, it is shown preceded
by an asterisk, meaning it is a possible new entry. All combinations are explored and the
resulting segmentations are displayed in the Segmentations section. At this stage, to
parse a new word, start by entering its affixes in the lexicon.
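The recursive exploration described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the parser's real implementation: the function segment and its arguments are hypothetical, only hyphen-marked prefixes and suffixes are handled (no clitics or reduplication), and the unsegmented word is always kept as one starred candidate.

```python
def segment(word, roots, prefixes, suffixes):
    """Return every segmentation of `word` as a list of morph lists.

    roots: set of known lemmas or stems, without markers
    prefixes: forms such as "un-"; suffixes: forms such as "-ing"
    """
    if word in roots:
        return [[word]]
    results = []
    for prefix in prefixes:
        bare = prefix.rstrip("-")
        # Strip the prefix only if something is left to analyse.
        if bare and word.startswith(bare) and len(word) > len(bare):
            for rest in segment(word[len(bare):], roots, prefixes, suffixes):
                results.append([prefix] + rest)
    for suffix in suffixes:
        bare = suffix.lstrip("-")
        if bare and word.endswith(bare) and len(word) > len(bare):
            for rest in segment(word[:-len(bare)], roots, prefixes, suffixes):
                results.append(rest + [suffix])
    # The remainder is unknown: star it as a possible new entry.
    results.append(["*" + word])
    # The same analysis can be reached by several paths; keep one copy.
    unique = []
    for seg in results:
        if seg not in unique:
            unique.append(seg)
    return unique


for seg in segment("undoing", {"do"}, ["un-"], ["-ing"]):
    print(" ".join(seg))
```

With these sample inputs, the candidates include the full analysis un- do -ing as well as starred alternatives such as *undo -ing, which mirrors how unknown remainders are flagged as possible new entries.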