Treebank Statistics: UD_Old_East_Slavic-RNC: POS Tags: NUM
There are 390 NUM lemmas (3%), 678 NUM types (2%) and 3813 NUM tokens (2%).
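The three counts above could be reproduced from the treebank's CoNLL-U files roughly as follows. This is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the sample rows are hypothetical, and it assumes that a "type" is a distinct case-sensitive word form and that multiword-token ranges and empty nodes are skipped.

```python
def pos_counts(conllu_text, upos):
    """Return (lemma_count, type_count, token_count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        # This drops multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            tokens += 1
            types.add(form)    # assumption: types are case-sensitive forms
            lemmas.add(lemma)
    return len(lemmas), len(types), tokens


# Made-up sample, NOT taken from the treebank; only the column layout matters.
SAMPLE = "\n".join([
    "# text = (hypothetical sample)",
    "1\tдва\tдва\tNUM\t_\tCase=Nom\t2\tnummod:gov\t_\t_",
    "2\tnoun\tnoun\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\tдве\tдва\tNUM\t_\tCase=Nom\t2\tconj\t_\t_",
    "4\t3\t3\tNUM\t_\tCase=Nom\t2\tconj\t_\t_",
    "5\tдва\tдва\tNUM\t_\tCase=Acc\t2\tconj\t_\t_",
])

print(pos_counts(SAMPLE, "NUM"))  # (2, 3, 4)
```

In the sample, the four NUM tokens share three distinct forms and two lemmas.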
Out of 17 observed tags, the rank of NUM is: 6 in number of lemmas, 7 in number of types and 11 in number of tokens.
The 10 most frequent NUM lemmas: два, 3, 2, 4, одинъ, трие, 5, 10, 6, четыре
The 10 most frequent NUM types: 3, 2, 4, два, 5, три, 10, 6, две, один
The 10 most frequent ambiguous lemmas: 3 (NUM 254, ADJ 22, ADV 5), 2 (NUM 205, ADJ 22), 4 (NUM 159, ADJ 21), одинъ (NUM 149, DET 7, ADJ 3), 5 (NUM 120, ADJ 19), 10 (NUM 98, ADJ 17), 6 (NUM 94, ADJ 12), 7 (NUM 74, ADJ 14), 8 (NUM 70, ADJ 19), 9 (NUM 61, ADJ 8)
The 10 most frequent ambiguous types: 3 (NUM 240, ADJ 22, ADV 6), 2 (NUM 201, ADJ 22), 4 (NUM 158, ADJ 21), 5 (NUM 115, ADJ 18), 10 (NUM 92, ADJ 16), 6 (NUM 91, ADJ 11), 7 (NUM 71, ADJ 14), 8 (NUM 69, ADJ 18), 9 (NUM 59, ADJ 8), 12 (NUM 55, ADJ 15)
Morphology
The form / lemma ratio of NUM is 1.738462 (the average of all parts of speech is 2.481645).
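The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page. The short check below assumes that definition; the variable names are illustrative only.

```python
# Counts reported on this page for NUM.
num_types = 678
num_lemmas = 390

# Assumption: form / lemma ratio = distinct forms (types) / distinct lemmas.
form_lemma_ratio = num_types / num_lemmas

print(f"{form_lemma_ratio:.6f}")  # 1.738462
```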
The 1st highest number of forms (27) was observed with the lemma “одинъ”: адин, адинъ, адна, один, одиного, одиною, одинъ, одна, однем, однеми, одно, однова, одново, одного, одное, однои, одном, одномъ, одною, одну, одъну, отнех, ъднои, ѡдин, ѡдиног[о], ѡдинъ, ѡдинѡг[о].
The 2nd highest number of forms (21) was observed with the lemma “два”: [д]ве, д[ва], дв[а], дв[е], два, две, двема, дви, двома, двою, дву, двум, двумъ, двумя, двух, двухъ, двѣ, двѣма, двѣмъ, двѣмя, двꙋ.
The 3rd highest number of forms (17) was observed with the lemma “оба”: [о]беих, Обоево, оба, обе, обеим, обеих, обоеꙗ, обоим, обоимъ, обоих, обоихъ, обою, обу, обѣ, обѣих, обѣихъ, ѡбѣ.
NUM occurs with 8 features: NumForm (3813; 100% instances), NumType (3813; 100% instances), Case (3808; 100% instances), Gender (1512; 40% instances), Number (267; 7% instances), Animacy (21; 1% instances), Degree (5; 0% instances), Variant (1; 0% instances)
NUM occurs with 22 feature-value pairs: Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Degree=Cmp, Gender=Fem, Gender=Masc, Gender=Neut, NumForm=Combi, NumForm=Cyril, NumForm=Digit, NumForm=Roman, NumForm=Word, NumType=Card, NumType=Frac, NumType=Sets, Number=Plur, Number=Sing, Variant=Short
NUM occurs with 115 feature combinations.
The most frequent feature combination is Case=Nom|NumForm=Digit|NumType=Card (1392 tokens).
Examples: 3, 5, 10, 6, 8, 4, 12, 9, 7, 2
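A feature combination is the full value of the FEATS column, with feature-value pairs joined by `|`. The sketch below shows one way to split such a string into pairs and to find the most frequent combination in a list of FEATS values. The helper names and the sample list are hypothetical; the counts in the sample are made up and do not come from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Split a FEATS string such as 'Case=Nom|NumType=Card' into a dict."""
    if feats == "_":  # CoNLL-U uses '_' for an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def most_common_combination(feats_column):
    """Return (combination, count) for the most frequent FEATS string."""
    return Counter(feats_column).most_common(1)[0]


print(parse_feats("Case=Nom|NumForm=Digit|NumType=Card"))
# {'Case': 'Nom', 'NumForm': 'Digit', 'NumType': 'Card'}

# Made-up FEATS values for illustration only.
SAMPLE_FEATS = [
    "Case=Nom|NumForm=Digit|NumType=Card",
    "Case=Acc|NumForm=Word|NumType=Card",
    "Case=Nom|NumForm=Digit|NumType=Card",
]
print(most_common_combination(SAMPLE_FEATS))
# ('Case=Nom|NumForm=Digit|NumType=Card', 2)
```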
Relations
NUM nodes are attached to their parents using 26 different relations: nummod:gov (2553; 67% instances), nummod (580; 15% instances), conj (227; 6% instances), root (125; 3% instances), compound (85; 2% instances), nsubj (60; 2% instances), appos (44; 1% instances), obl (29; 1% instances), obj (22; 1% instances), nmod (17; 0% instances), parataxis (16; 0% instances), nsubj:pass (8; 0% instances), obl:float (7; 0% instances), advcl (6; 0% instances), flat (6; 0% instances), list (6; 0% instances), orphan (5; 0% instances), acl (3; 0% instances), amod (3; 0% instances), iobj (3; 0% instances), dep (2; 0% instances), xcomp (2; 0% instances), csubj (1; 0% instances), dislocated (1; 0% instances), fixed (1; 0% instances), obl:depict (1; 0% instances)
Parents of NUM nodes belong to 12 different parts of speech: NOUN (3192; 84% instances), NUM (302; 8% instances), (125; 3% instances), VERB (98; 3% instances), ADJ (50; 1% instances), PRON (18; 0% instances), ADV (9; 0% instances), PROPN (8; 0% instances), X (5; 0% instances), DET (4; 0% instances), ADP (1; 0% instances), PART (1; 0% instances)
3334 (87%) NUM nodes are leaves.
274 (7%) NUM nodes have one child.
92 (2%) NUM nodes have two children.
113 (3%) NUM nodes have three or more children.
The highest child degree of a NUM node is 12.
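The child-degree figures above count, for each NUM node, how many other nodes name it as their head. A minimal sketch of that tally for a single sentence follows. The function name, the `(id, upos, head)` row layout and the sample rows are assumptions made for illustration, not the tooling behind this page.

```python
from collections import Counter


def child_degree_histogram(sentence_rows, upos):
    """Map child count -> number of nodes with that count, for one UPOS tag.

    sentence_rows: list of (id, upos, head) tuples for ONE sentence,
    where head is the ID of the parent node (0 for the root).
    """
    # Each occurrence of an ID in the head column is one child of that node.
    children = Counter(head for _, _, head in sentence_rows)
    hist = Counter()
    for node_id, tag, _ in sentence_rows:
        if tag == upos:
            hist[children[node_id]] += 1  # nodes never used as a head count as 0
    return dict(hist)


# Made-up sentence structure, for illustration only.
ROWS = [
    (1, "NUM", 2),
    (2, "NOUN", 0),
    (3, "NUM", 2),
    (4, "PUNCT", 3),
    (5, "NUM", 3),
]

print(child_degree_histogram(ROWS, "NUM"))  # {0: 2, 2: 1}
```

Here nodes 1 and 5 are leaves and node 3 has two children. Summing such histograms over all sentences would give totals of the kind listed above, whose four counts (3334, 274, 92, 113) add up to the 3813 NUM tokens.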
Children of NUM nodes are attached using 28 different relations: conj (208; 23% instances), punct (155; 17% instances), case (115; 13% instances), cc (83; 9% instances), compound (66; 7% instances), advmod (63; 7% instances), nsubj (54; 6% instances), nmod (45; 5% instances), list (23; 3% instances), obl (15; 2% instances), nummod:gov (13; 1% instances), cop (9; 1% instances), mark (8; 1% instances), appos (7; 1% instances), flat (6; 1% instances), nummod (5; 1% instances), parataxis (5; 1% instances), orphan (4; 0% instances), vocative (4; 0% instances), advcl (3; 0% instances), iobj (3; 0% instances), obl:pronmod (3; 0% instances), amod (2; 0% instances), det (2; 0% instances), parataxis:discourse (2; 0% instances), acl:relcl (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances)
Children of NUM nodes belong to 15 different parts of speech: NUM (302; 33% instances), PUNCT (155; 17% instances), ADP (115; 13% instances), NOUN (103; 11% instances), CCONJ (81; 9% instances), ADV (53; 6% instances), PART (22; 2% instances), ADJ (15; 2% instances), VERB (14; 2% instances), X (11; 1% instances), AUX (10; 1% instances), PRON (9; 1% instances), SCONJ (7; 1% instances), DET (6; 1% instances), PROPN (3; 0% instances)