This page pertains to UD version 2.

Treebank Statistics: UD_Turkish-BOUN: POS Tags: NUM

There are 463 NUM lemmas (3%), 584 NUM types (2%) and 2606 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 6 in number of lemmas, 6 in number of types and 11 in number of tokens.

The 10 most frequent NUM lemmas: iki, 1, bir, üç, 2, yüz, 3, bin, 4, beş

The 10 most frequent NUM types: 1, iki, 2, üç, bir, 3, yüzde, bin, 4, milyon

The 10 most frequent ambiguous lemmas: iki (NUM 252, NOUN 6, ADV 5, PROPN 1), bir (DET 2587, NUM 141, ADV 59, NOUN 6, PRON 5, CCONJ 2, PART 1), üç (NUM 131, NOUN 2, ADJ 1), 2 (NUM 116, NOUN 2), yüz (NOUN 107, NUM 90, VERB 6, ADV 1), 3 (NUM 83, NOUN 1, PROPN 1), bin (NUM 66, VERB 19, NOUN 7, PROPN 1), beş (NUM 53, NOUN 2), milyon (NUM 51, NOUN 6, PROPN 1), 5 (NUM 47, NOUN 1)

The 10 most frequent ambiguous types: iki (NUM 144, ADV 1, NOUN 1), 2 (NUM 112, NOUN 1), bir (DET 2373, NUM 83, ADV 38, CCONJ 2, NOUN 1, PART 1), on (NUM 34, NOUN 1), 6 (NUM 35, NOUN 1), kaç (NUM 26, DET 1, PRON 1), İki (NUM 22, ADV 1), birer (NUM 18, ADV 1), yüz (NUM 19, NOUN 6, AUX 2), 7 (NUM 19, NOUN 1)

Morphology

The form / lemma ratio of NUM is 1.261339 (the average of all parts of speech is 2.412899).
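
The ratio above appears to be the number of distinct NUM types (584) divided by the number of distinct NUM lemmas (463). The sketch below reproduces the figure under that assumption; the function name `form_lemma_ratio` is hypothetical and is not part of any UD tooling.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the number of distinct word forms per distinct lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported for NUM at the top of this page.
num_ratio = form_lemma_ratio(584, 463)
print(f"{num_ratio:.6f}")
```

A ratio close to 1 means that most numeral lemmas occur in a single surface form, which fits the observation that only a few lemmas such as "iki" show many inflected variants.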

The 1st highest number of forms (14) was observed with the lemma “iki”: iki, ikimiz, ikinci, ikisi, ikisini, ikisinin, ikiye, ikişer, İki, İkimiz, İkinci, İkinin, İkisi, İkisini.

The 2nd highest number of forms (7) was observed with the lemma “bir”: Birden, Biri, bir, bire, birer, birinci, birini.

The 3rd highest number of forms (7) was observed with the lemma “biri”: biri, biridir, birimiz, birinci, birinde, birine, birini.

NUM occurs with 7 features: NumType (2253; 86% instances), Number (679; 26% instances), Person (679; 26% instances), Case (676; 26% instances), Number[psor] (106; 4% instances), Person[psor] (106; 4% instances), PronType (2; 0% instances)

NUM occurs with 20 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Equ, Case=Gen, Case=Ins, Case=Loc, Case=Nom, NumType=Card, NumType=Dist, NumType=Ord, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=3, Person[psor]=1, Person[psor]=3, PronType=Dem

NUM occurs with 50 feature combinations. The most frequent feature combination is NumType=Card (1830 tokens). Examples: 1, iki, üç, 2, bir, bin, 4, 3, milyon, 5

Relations

NUM nodes are attached to their parents using 24 different relations: nummod (1547; 59% instances), flat (340; 13% instances), obl (188; 7% instances), conj (99; 4% instances), amod (97; 4% instances), nmod:poss (86; 3% instances), root (46; 2% instances), nsubj (44; 2% instances), list (39; 1% instances), obj (38; 1% instances), compound (17; 1% instances), obl:tmod (16; 1% instances), nmod (9; 0% instances), parataxis (7; 0% instances), advcl (6; 0% instances), orphan (6; 0% instances), appos (5; 0% instances), compound:redup (5; 0% instances), dep (3; 0% instances), iobj (3; 0% instances), acl (2; 0% instances), ccomp (1; 0% instances), discourse (1; 0% instances), xcomp (1; 0% instances)

Parents of NUM nodes belong to 12 different parts of speech (counting the absent parent of root nodes as one category): NOUN (1576; 60% instances), NUM (465; 18% instances), VERB (275; 11% instances), ADJ (103; 4% instances), PROPN (91; 3% instances), no parent, i.e. the NUM node is the root (46; 2% instances), ADV (33; 1% instances), DET (6; 0% instances), PRON (5; 0% instances), ADP (4; 0% instances), AUX (1; 0% instances), CCONJ (1; 0% instances)
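
Counts like the relation and parent part-of-speech figures above can be gathered from a CoNLL-U file by looking at the UPOS, HEAD and DEPREL columns. The following is a minimal sketch, not the script that generated this page. The three-token sentence is a made-up illustration rather than data from UD_Turkish-BOUN, the function name `num_attachment_stats` is hypothetical, and the label `ROOT` for head 0 is a choice made here.

```python
from collections import Counter

# Hypothetical sentence in CoNLL-U column order (not taken from the treebank):
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
ROWS = [
    ["1", "iki", "iki", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "kitap", "kitap", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["3", "okudum", "oku", "VERB", "_", "_", "0", "root", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS) + "\n"


def num_attachment_stats(conllu_text):
    """Count DEPREL labels and parent UPOS tags for all NUM tokens."""
    relations, parent_pos = Counter(), Counter()
    sentence = []
    # The trailing empty string flushes the last sentence.
    for line in conllu_text.splitlines() + [""]:
        if line.startswith("#"):
            continue  # sentence-level comment
        if line.strip():
            cols = line.split("\t")
            if cols[0].isdigit():  # skip multiword ranges and empty nodes
                sentence.append(cols)
            continue
        # Blank line: the sentence is complete, so heads can be resolved.
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if cols[3] == "NUM":
                relations[cols[7]] += 1
                parent_pos[upos_by_id.get(cols[6], "ROOT")] += 1
        sentence = []
    return relations, parent_pos


relations, parent_pos = num_attachment_stats(SAMPLE)
print(relations, parent_pos)
```

Heads are resolved per sentence because token IDs restart at 1 after every blank line; head 0 has no entry in the lookup and therefore falls into the `ROOT` bucket.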

1845 (71%) NUM nodes are leaves.

499 (19%) NUM nodes have one child.

176 (7%) NUM nodes have two children.

86 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
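
As a consistency check, the four child-degree counts above should add up to the 2606 NUM tokens, and the percentages should follow from the counts. A short sketch, assuming the percentages are rounded to the nearest integer:

```python
# Child-degree counts for NUM nodes as reported on this page.
degree_counts = {"0": 1845, "1": 499, "2": 176, "3+": 86}

total = sum(degree_counts.values())

# Share of NUM nodes per child degree, rounded to whole percent.
shares = {degree: round(100 * count / total)
          for degree, count in degree_counts.items()}

print(total, shares)
```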

Children of NUM nodes are attached using 32 different relations: flat (358; 31% instances), punct (266; 23% instances), conj (80; 7% instances), nummod (53; 5% instances), obl (45; 4% instances), amod (40; 3% instances), compound (40; 3% instances), nmod:poss (35; 3% instances), advmod:emph (29; 3% instances), nsubj (29; 3% instances), advmod (24; 2% instances), case (22; 2% instances), det (20; 2% instances), nmod:part (20; 2% instances), cc (18; 2% instances), list (12; 1% instances), nmod (11; 1% instances), acl (8; 1% instances), cop (6; 1% instances), orphan (6; 1% instances), compound:redup (5; 0% instances), advcl (4; 0% instances), aux (3; 0% instances), compound:lvc (3; 0% instances), fixed (3; 0% instances), obl:tmod (3; 0% instances), appos (2; 0% instances), cc:preconj (2; 0% instances), dep:der (2; 0% instances), obj (2; 0% instances), ccomp (1; 0% instances), parataxis (1; 0% instances)

Children of NUM nodes belong to 13 different parts of speech: NUM (465; 40% instances), PUNCT (266; 23% instances), NOUN (183; 16% instances), ADJ (40; 3% instances), VERB (34; 3% instances), CCONJ (32; 3% instances), ADV (30; 3% instances), PART (28; 2% instances), PROPN (27; 2% instances), DET (21; 2% instances), ADP (12; 1% instances), AUX (9; 1% instances), PRON (6; 1% instances)