
This page pertains to UD version 2.

Treebank Statistics: UD_Lithuanian-ALKSNIS: POS Tags: NUM

There are 414 NUM lemmas (5%), 513 NUM types (3%) and 1699 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 10 in number of tokens.
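The three counts above (lemmas, types, tokens) can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts`, the helper `row` and the toy sentence are hypothetical, and the sketch assumes that types are distinct word forms compared case-sensitively.

```python
def pos_counts(conllu_lines, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column (case-sensitive: an assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


def row(idx, form, lemma, upos, head, deprel):
    """Build one CoNLL-U token line (hypothetical helper for the toy data)."""
    return "\t".join(
        [str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"]
    )


# Hypothetical toy sentence, not taken from the treebank.
sample = [
    "# sent_id = toy-1",
    row(1, "dvi", "du", "NUM", 2, "nummod"),
    row(2, "knygos", "knyga", "NOUN", 0, "root"),
    row(3, "du", "du", "NUM", 4, "nummod"),
    row(4, "vaikai", "vaikas", "NOUN", 2, "conj"),
    row(5, "dvi", "du", "NUM", 6, "nummod"),
    row(6, "mergaitės", "mergaitė", "NOUN", 2, "conj"),
]

result = pos_counts(sample, "NUM")
print(result)  # (1, 2, 3): one lemma, two types, three tokens
```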

The 10 most frequent NUM lemmas: 1, 2, 3, pirmas, 2006, du, 4, 5, vienas, antras

The 10 most frequent NUM types: 1, 2, 3, 2006, 4, 5, 6, 25, 7, 10

The most frequent ambiguous lemmas (only four are listed): pirmas (NUM 57, ADJ 2), vienas (PRON 115, NUM 38, ADJ 7, X 6), 22 (NUM 10, X 1), abu (NUM 9, PRON 7)

The 10 most frequent ambiguous types: vieną (NUM 18, PRON 14, ADJ 2), 22 (NUM 10, X 1), I (NUM 7, X 5), viena (PRON 13, NUM 5, ADV 2, X 1), vieno (PRON 6, NUM 5, X 2), V (X 19, NUM 3), pirma (X 11, NUM 1), vienas (PRON 29, NUM 3, ADJ 2, X 2), abiem (NUM 2, PRON 1), pirmieji (NUM 2, ADJ 1)
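An item (lemma or type) counts as ambiguous here when it occurs with NUM and with at least one other tag. A minimal sketch of that bookkeeping follows; the function name `ambiguous_items` and the toy counts are hypothetical, and sorting by the NUM count is this sketch's own choice, since the page does not state how its lists are ordered.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos):
    """pairs: iterable of (item, tag), where item is a lemma or a word form.

    Returns items seen with `upos` and at least one other tag,
    sorted by how often they carry `upos` (an assumed ordering).
    """
    tags_by_item = defaultdict(Counter)
    for item, tag in pairs:
        tags_by_item[item][tag] += 1
    ambiguous = [
        (item, tags.most_common())
        for item, tags in tags_by_item.items()
        if upos in tags and len(tags) > 1
    ]
    ambiguous.sort(key=lambda entry: dict(entry[1])[upos], reverse=True)
    return ambiguous


# Hypothetical toy counts, not the treebank's figures.
pairs = (
    [("abu", "NUM")] * 2
    + [("abu", "PRON")]
    + [("du", "NUM")] * 3
    + [("pirmas", "NUM")] * 3
    + [("pirmas", "ADJ")]
)

result = ambiguous_items(pairs, "NUM")
for item, tags in result:
    print(item, tags)
# pirmas [('NUM', 3), ('ADJ', 1)]
# abu [('NUM', 2), ('PRON', 1)]
```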

Morphology

The form / lemma ratio of NUM is 1.239130 (the average of all parts of speech is 2.065341).
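The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given earlier on this page. A minimal check, assuming that is how the figure was derived (the page does not state the formula):

```python
# Counts taken from this page; the formula itself is an assumption.
num_types = 513
num_lemmas = 414

form_lemma_ratio = num_types / num_lemmas
print(f"{form_lemma_ratio:.6f}")  # prints 1.239130
```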

The highest number of forms (21) was observed with the lemma “pirmas”: Pirmajai, Pirmuose, pirma, pirmaisiais, pirmajame, pirmame, pirmas, pirmasis, pirmi, pirmieji, pirmo, pirmoji, pirmojo, pirmojoje, pirmos, pirmosios, pirmuosiuose, pirmą, pirmąją, pirmąjį, pirmųjų.

The second-highest number of forms (15) was observed with the lemma “antras”: Antraisiais, Antrame, Antrosios, antra, antrajame, antram, antrasis, antro, antroje, antroji, antrojo, antrojoje, antros, antrą, antrąjį.

The third-highest number of forms (10) was observed with the lemma “trečias”: Trečiame, Trečioji, tretiesiems, trečia, trečioje, trečiojo, trečios, trečiosiomis, trečiosios, trečiąjį.

NUM occurs with 7 features: NumForm (1688; 99% instances), Definite (1494; 88% instances), NumType (350; 21% instances), Gender (338; 20% instances), Case (333; 20% instances), Number (201; 12% instances), Hyph (3; 0% instances)

NUM occurs with 22 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Gender=Neut, Hyph=Yes, NumForm=Combi, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Mult, NumType=Ord, NumType=Sets, Number=Plur, Number=Sing

NUM occurs with 88 feature combinations. The most frequent feature combination is Definite=Ind|NumForm=Digit (1283 tokens). Examples: 1, 2, 3, 2006, 4, 5, 6, 25, 7, 10
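The three feature statistics above (features, feature-value pairs, feature combinations) can all be derived from the FEATS column of the NUM tokens. The sketch below illustrates one way to do this; the function name `feature_stats` and the sample values are hypothetical, and the sketch treats a whole FEATS string as one combination.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count features, feature=value pairs and full FEATS combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":  # token without morphological features
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Hypothetical FEATS values for four NUM tokens.
sample = [
    "Definite=Ind|NumForm=Digit",
    "Definite=Ind|NumForm=Digit",
    "Case=Nom|Gender=Masc|NumForm=Word|NumType=Card",
    "NumForm=Roman",
]

features, pairs, combos = feature_stats(sample)
print(features["NumForm"])     # 4
print(len(pairs))              # 7
print(combos.most_common(1))   # [('Definite=Ind|NumForm=Digit', 2)]
```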

Relations

NUM nodes are attached to their parents using 14 different relations: nummod (1226; 72% instances), obl (178; 10% instances), conj (132; 8% instances), nummod:gov (51; 3% instances), root (39; 2% instances), dep (21; 1% instances), nsubj (14; 1% instances), obl:arg (10; 1% instances), parataxis (10; 1% instances), appos (7; 0% instances), obj (5; 0% instances), orphan (3; 0% instances), compound (2; 0% instances), nsubj:pass (1; 0% instances)
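As a consistency check, the relation counts above add up to the 1699 NUM tokens reported at the top of this page, and the percentages follow from rounding each share to a whole number. A minimal sketch using the counts from this page (the variable names are illustrative only):

```python
# Relation counts copied from the list above.
relation_counts = {
    "nummod": 1226, "obl": 178, "conj": 132, "nummod:gov": 51,
    "root": 39, "dep": 21, "nsubj": 14, "obl:arg": 10,
    "parataxis": 10, "appos": 7, "obj": 5, "orphan": 3,
    "compound": 2, "nsubj:pass": 1,
}

total = sum(relation_counts.values())
percentages = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                  # 1699
print(percentages["nummod"])  # 72
print(percentages["conj"])    # 8
```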

Parents of NUM nodes fall into 11 categories, 10 parts of speech plus the artificial root node (which has no part of speech): NOUN (804; 47% instances), VERB (402; 24% instances), X (279; 16% instances), NUM (108; 6% instances), root node (39; 2% instances), ADJ (21; 1% instances), ADV (16; 1% instances), SYM (13; 1% instances), PRON (10; 1% instances), PROPN (5; 0% instances), DET (2; 0% instances)

867 (51%) NUM nodes are leaves.

644 (38%) NUM nodes have one child.

117 (7%) NUM nodes have two children.

71 (4%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
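The child degree of a node is the number of tokens whose HEAD points to it. The sketch below shows how such a distribution could be computed for one tree and also checks that the four counts above cover all 1699 NUM tokens; the function name `child_degree_histogram` and the toy tree are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentence, upos):
    """sentence: list of (id, upos, head) tuples for one dependency tree.

    Returns a Counter mapping child degree -> number of `upos` nodes.
    """
    children = Counter(head for _, _, head in sentence)
    return Counter(
        children[tok_id] for tok_id, tag, _ in sentence if tag == upos
    )


# Hypothetical tree: token 2 (NUM) is the root with three children,
# token 4 (NUM) is a leaf.
sentence = [
    (1, "ADP", 2),
    (2, "NUM", 0),
    (3, "PUNCT", 2),
    (4, "NUM", 2),
]

hist = child_degree_histogram(sentence, "NUM")
print(hist[3], hist[0])  # 1 1

# Counts reported on this page: leaves, one child, two children, three or more.
reported_total = 867 + 644 + 117 + 71
print(reported_total)  # 1699
```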

Children of NUM nodes are attached using 23 different relations: punct (567; 50% instances), nmod (161; 14% instances), conj (152; 13% instances), case (85; 7% instances), advmod:emph (48; 4% instances), cc (30; 3% instances), obl:arg (19; 2% instances), nummod (15; 1% instances), advmod (14; 1% instances), mark (13; 1% instances), appos (11; 1% instances), obl (6; 1% instances), nsubj (5; 0% instances), amod (3; 0% instances), orphan (3; 0% instances), advcl (2; 0% instances), compound (2; 0% instances), discourse (2; 0% instances), acl:relcl (1; 0% instances), ccomp (1; 0% instances), csubj (1; 0% instances), det (1; 0% instances), parataxis (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: PUNCT (567; 50% instances), NOUN (175; 15% instances), NUM (108; 9% instances), ADP (85; 7% instances), X (67; 6% instances), PART (47; 4% instances), CCONJ (30; 3% instances), ADV (17; 1% instances), SCONJ (13; 1% instances), VERB (13; 1% instances), ADJ (6; 1% instances), PRON (5; 0% instances), SYM (5; 0% instances), DET (3; 0% instances), INTJ (2; 0% instances)