Treebank Statistics: UD_Kazakh-KTB: POS Tags: NUM
There are 165 NUM lemmas (6%), 188 NUM types (4%) and 374 NUM tokens (4%).
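The three counts above can be reproduced from a CoNLL-U file with a few lines of Python. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the five-token sample are invented for illustration, and it assumes that a "type" is a distinct case-sensitive word form.

```python
def upos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column, case-sensitive (assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Hypothetical rows (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC),
# made up for this example and not taken from the treebank.
rows = [
    ("1", "бір", "бір", "NUM", "_", "NumType=Card", "4", "nummod", "_", "_"),
    ("2", "екі", "екі", "NUM", "_", "NumType=Card", "4", "nummod", "_", "_"),
    ("3", "бірі", "бір", "NUM", "_", "NumType=Card", "4", "nmod", "_", "_"),
    ("4", "кітап", "кітап", "NOUN", "_", "Case=Nom", "0", "root", "_", "_"),
    ("5", "бір", "бір", "NUM", "_", "NumType=Card", "4", "nummod", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in rows)

print(upos_counts(SAMPLE, "NUM"))
```

On the sample, the four NUM tokens share two lemmas and three distinct forms, so tokens outnumber types and types outnumber lemmas, as in the figures reported for the treebank.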
Out of 17 observed tags, NUM ranks 5th in number of lemmas, 5th in number of types and 7th in number of tokens.
The 10 most frequent NUM lemmas: бір, екі, үш, миллиард, 1, 12, екеу, 2, 20, 11
The 10 most frequent NUM types: бір, екі, бірі, миллиард, 1, 12, 2, 11, 20, үш
The 10 most frequent ambiguous lemmas: бір (NUM 44, DET 20, PRON 1, X 1), миллион (NUM 4, NOUN 2), млн. (NOUN 3, NUM 2), біреу (PRON 5, NUM 1)
The 10 most frequent ambiguous types: бір (NUM 21, DET 16, X 1), миллион (NUM 3, NOUN 1), бірдей (NUM 2, ADJ 1), млн. (NOUN 3, NUM 2), миллионнан (NOUN 1, NUM 1)
Morphology
The form / lemma ratio of NUM is 1.139394 (the average of all parts of speech is 1.747153).
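The ratio is consistent with dividing the number of NUM types by the number of NUM lemmas reported in this section. A quick check, assuming that this is how the figure is derived (the variable names are illustrative):

```python
num_types = 188   # distinct NUM word forms reported in this section
num_lemmas = 165  # distinct NUM lemmas reported in this section

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # prints 1.139394
```

A value close to 1 means that most numeral lemmas appear in only one form, which fits the large share of digit-only numerals among the frequent lemmas.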
The highest number of forms (6) was observed with the lemma “бір”: Бірін, Бірінші, бір, бірдей, бірі, бірінен.
The 2nd highest number of forms (3) was observed with the lemma “30”: 30, 30%, 30-шы.
The 3rd highest number of forms (3) was observed with the lemma “жеті”: жеті, жетінші, жетіңіз.
NUM occurs with 5 features: NumType (374; 100% instances), Case (46; 12% instances), Number[psor] (28; 7% instances), Person[psor] (28; 7% instances), Polite (1; 0% instances)
NUM occurs with 15 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Loc, Case=Nom, NumType=Card, NumType=Card,Ord, NumType=Coll, NumType=Ord, Number[psor]=Plur,Sing, Number[psor]=Sing, Person[psor]=2, Person[psor]=3, Polite=Form
NUM occurs with 18 feature combinations.
The most frequent feature combination is NumType=Card (149 tokens).
Examples: бір, екі, миллиард, үш, 12, төрт, сегіз, 1, 4, 5
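Features, feature-value pairs and feature combinations are three different ways of counting the FEATS column of CoNLL-U. The sketch below shows one way to tally all three. The function name `feature_stats` and the input list are hypothetical, and a multi-value entry such as NumType=Card,Ord is treated as a single pair, matching the way it is listed above.

```python
from collections import Counter


def feature_stats(feats_values):
    """Tally features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_values:
        combos[feats] += 1          # the full FEATS string is one combination
        if feats == "_":
            continue                # "_" means the token has no features
        for item in feats.split("|"):
            name, _value = item.split("=", 1)
            features[name] += 1     # e.g. "Case"
            pairs[item] += 1        # e.g. "Case=Nom"
    return features, pairs, combos


# Hypothetical FEATS values for four NUM tokens, made up for illustration.
FEATS = [
    "NumType=Card",
    "NumType=Card",
    "Case=Nom|NumType=Ord",
    "Case=Acc|NumType=Card",
]

features, pairs, combos = feature_stats(FEATS)
print(features)
print(pairs)
print(combos.most_common(1))
```

In this toy input every token carries NumType, so that feature covers 100% of the instances, while Case appears on only half of them.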
Relations
NUM nodes are attached to their parents using 14 different relations: amod (140; 37% instances), nummod (110; 29% instances), compound (37; 10% instances), appos (15; 4% instances), nsubj (15; 4% instances), conj (12; 3% instances), root (11; 3% instances), nmod (10; 3% instances), obl (10; 3% instances), nmod:poss (4; 1% instances), advcl (3; 1% instances), obj (3; 1% instances), orphan (2; 1% instances), parataxis (2; 1% instances)
Parents of NUM nodes belong to 7 different parts of speech: NOUN (280; 75% instances), NUM (44; 12% instances), VERB (20; 5% instances), ADJ (11; 3% instances), the untagged artificial root (11; 3% instances), PROPN (6; 2% instances), PRON (2; 1% instances)
242 (65%) NUM nodes are leaves.
94 (25%) NUM nodes have one child.
22 (6%) NUM nodes have two children.
16 (4%) NUM nodes have three or more children.
The highest child degree of a NUM node is 5.
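The child-degree figures count, for each NUM node, how many other nodes name it as their head. A minimal sketch of that computation for a single sentence follows. The function name `child_degree_histogram` and the six-token sentence are invented for illustration, and the tokens are reduced to (ID, UPOS, HEAD) triples.

```python
from collections import Counter


def child_degree_histogram(tokens, upos):
    """Map child degree -> number of nodes with that degree, for one UPOS tag.

    tokens is a list of (id, upos, head) triples from a single sentence.
    """
    # Number of children per node: count how often each ID occurs as a HEAD.
    children = Counter(head for _, _, head in tokens)
    # A Counter returns 0 for IDs that never occur as a head, i.e. leaves.
    return Counter(children[tid] for tid, tag, _ in tokens if tag == upos)


# Hypothetical sentence structure, not taken from the treebank.
TOKENS = [
    (1, "NUM", 3),
    (2, "PUNCT", 1),
    (3, "NOUN", 0),
    (4, "NUM", 3),
    (5, "NUM", 4),
    (6, "NOUN", 4),
]

hist = child_degree_histogram(TOKENS, "NUM")
print(sorted(hist.items()))
```

Here node 5 is a leaf, node 1 has one child and node 4 has two. Summing such histograms over all sentences would give the kind of distribution reported above, whose four counts add up to the 374 NUM tokens.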
Children of NUM nodes are attached using 20 different relations: punct (83; 42% instances), compound (24; 12% instances), nmod:poss (15; 8% instances), conj (14; 7% instances), case (9; 5% instances), nsubj (9; 5% instances), nummod (6; 3% instances), obl (6; 3% instances), advmod (5; 3% instances), flat:name (5; 3% instances), cop (4; 2% instances), clf (3; 2% instances), nmod (3; 2% instances), advcl (2; 1% instances), cc (2; 1% instances), parataxis (2; 1% instances), amod (1; 1% instances), appos (1; 1% instances), dep (1; 1% instances), orphan (1; 1% instances)
Children of NUM nodes belong to 14 different parts of speech: PUNCT (83; 42% instances), NUM (44; 22% instances), NOUN (35; 18% instances), ADP (7; 4% instances), ADV (5; 3% instances), PROPN (5; 3% instances), AUX (4; 2% instances), VERB (4; 2% instances), ADJ (2; 1% instances), PRON (2; 1% instances), SYM (2; 1% instances), CCONJ (1; 1% instances), SCONJ (1; 1% instances), X (1; 1% instances)