Treebank Statistics: UD_Belarusian-HSE: POS Tags: NUM
There are 957 NUM lemmas (3%), 1008 NUM types (2%) and 5846 NUM tokens (2%).
Out of 17 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 12 in number of tokens.
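The lemma, type and token counts above can be reproduced in principle from the treebank's CoNLL-U files. The following is a minimal sketch, assuming that a "type" is a distinct word form compared case-sensitively and that multiword token ranges and empty nodes are skipped; the helper names `row` and `pos_counts` and the two-word sample are invented for illustration and are not taken from the treebank or from the script that generated this page.

```python
def row(idx, form, lemma, upos, feats, head, deprel):
    """Build one 10-column CoNLL-U line (XPOS, DEPS and MISC left as '_')."""
    return "\t".join([str(idx), form, lemma, upos, "_", feats, str(head), deprel, "_", "_"])


def pos_counts(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # sentence separators and comment lines
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented toy data: two forms of the lemma "два", each in its own sentence.
SAMPLE = "\n".join([
    row(1, "два", "два", "NUM", "NumType=Card", 0, "root"),
    "",
    row(1, "дзве", "два", "NUM", "NumType=Card", 0, "root"),
])

n_lemmas, n_types, n_tokens = pos_counts(SAMPLE, "NUM")
print(n_lemmas, n_types, n_tokens)
```

Whether the published figures fold case or count forms differently is not stated on this page, so the numbers from such a script may differ slightly from the ones reported here.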
The 10 most frequent NUM lemmas: два, адзін, тры, 10, 2, некалькі, 5, 1, 20, 3
The 10 most frequent NUM types: 10, 2, 5, некалькі, два, 1, тры, 20, 3, адзін
The 10 most frequent ambiguous lemmas: адзін (DET 352, NUM 206), 10 (NUM 175, ADJ 49), 2 (NUM 168, ADJ 37, PROPN 1), 5 (NUM 143, ADJ 24, PROPN 1), 1 (NUM 133, ADJ 66), 20 (NUM 125, ADJ 54), 3 (NUM 122, ADJ 62, X 1), 100 (NUM 100, ADJ 1), колькі (NUM 100, CCONJ 1), 15 (NUM 97, ADJ 36)
The 10 most frequent ambiguous types: 10 (NUM 175, ADJ 49), 2 (NUM 165, ADJ 37, PROPN 1), 5 (NUM 140, ADJ 24, PROPN 1), 1 (NUM 133, ADJ 66), 20 (NUM 125, ADJ 52), 3 (NUM 121, ADJ 62, ADP 4, X 1), адзін (NUM 92, DET 72), 100 (NUM 100, ADJ 1), колькі (NUM 63, CCONJ 1), 15 (NUM 96, ADJ 36)
Morphology
The form / lemma ratio of NUM is 1.053292 (the average of all parts of speech is 1.756638).
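The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given at the top of this page (1008 and 957). A short check, assuming that this is how the ratio is defined; the function name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(1008, 957)
print(f"{ratio:.6f}")  # prints 1.053292
```

A value this close to 1 means that most NUM lemmas occur in a single form, which fits the large share of digit-only numerals among the most frequent lemmas.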
The 1st highest number of forms (11) was observed with the lemma “два”: два, две, двум, двума, двух, дзве, дзвюмя, дзвюх, дзьве, дзьвюма, дзьвюх.
The 2nd highest number of forms (8) was observed with the lemma “адзін”: адзін, адна, аднаго, адно, адной, адну, адны, адным.
The 3rd highest number of forms (4) was observed with the lemma “абодва”: абедзвюх, абедзьве, абодва, абодвух.
NUM occurs with 5 features: NumType (4790; 82% instances), Case (1322; 23% instances), Animacy (534; 9% instances), Gender (516; 9% instances), Number (217; 4% instances)
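Each percentage here is the number of NUM tokens carrying the feature divided by the 5846 NUM tokens, rounded to a whole percent. The sketch below shows one way to derive such shares from the FEATS column of a CoNLL-U file; the function name `feature_shares` and the sample FEATS strings are made up for illustration.

```python
from collections import Counter


def feature_shares(feats_values):
    """Map each feature name to (token count, rounded percent of all tokens)."""
    counts = Counter()
    total = 0
    for feats in feats_values:
        total += 1
        if feats == "_":
            continue  # token has no morphological features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            counts[name] += 1
    if total == 0:
        return {}
    return {name: (n, round(100 * n / total)) for name, n in counts.items()}


# Invented FEATS values for four NUM tokens.
SAMPLE_FEATS = [
    "NumType=Card",
    "Case=Nom|Gender=Fem|NumType=Card",
    "_",
    "NumType=Card",
]
shares = feature_shares(SAMPLE_FEATS)
print(shares)

# The same rounding applied to the counts reported on this page.
reported = {"NumType": 4790, "Case": 1322, "Animacy": 534, "Gender": 516, "Number": 217}
reported_pct = {name: round(100 * n / 5846) for name, n in reported.items()}
print(reported_pct)
```

Because a token can carry several features at once, the percentages are not expected to sum to 100.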
NUM occurs with 15 feature-value pairs: Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, NumType=Sets, Number=Plur, Number=Sing
NUM occurs with 68 feature combinations.
The most frequent feature combination is NumType=Card (3738 tokens).
Examples: 10, 2, 5, 1, 20, 3, 100, 15, 18, 7
Relations
NUM nodes are attached to their parents using 24 different relations: nummod:gov (1926; 33% instances), nummod (1446; 25% instances), list (690; 12% instances), appos (446; 8% instances), nmod (370; 6% instances), root (283; 5% instances), parataxis (256; 4% instances), obl (152; 3% instances), conj (98; 2% instances), nsubj (70; 1% instances), obj (32; 1% instances), compound (26; 0% instances), orphan (11; 0% instances), amod (8; 0% instances), flat (6; 0% instances), nsubj:pass (6; 0% instances), fixed (5; 0% instances), ccomp (4; 0% instances), dep (3; 0% instances), advcl (2; 0% instances), iobj (2; 0% instances), xcomp (2; 0% instances), acl (1; 0% instances), acl:relcl (1; 0% instances)
Parents of NUM nodes belong to 13 different parts of speech: NOUN (3553; 61% instances), NUM (540; 9% instances), VERB (407; 7% instances), ADJ (288; 5% instances), PROPN (284; 5% instances), [root node] (283; 5% instances), SYM (282; 5% instances), X (129; 2% instances), ADV (47; 1% instances), PRON (16; 0% instances), DET (9; 0% instances), PART (6; 0% instances), ADP (2; 0% instances)
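Both distributions above come from the HEAD and DEPREL columns: for every NUM node, the relation label is counted once and the part of speech of the head node is looked up. A minimal sketch under those assumptions follows. A node whose HEAD is 0 hangs from the artificial root, which has no part of speech, so the sketch files it under the hypothetical label `ROOT`; the names `row` and `attachment_stats` and the sample sentences are likewise invented.

```python
from collections import Counter


def row(idx, form, lemma, upos, head, deprel):
    """Build one 10-column CoNLL-U line (unused columns left as '_')."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])


def attachment_stats(conllu_text, upos):
    """Return (relation counts, parent POS counts) for nodes tagged `upos`."""
    deprels, parent_pos = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [l.split("\t") for l in block.splitlines() if l and not l.startswith("#")]
        # Keep basic nodes only: drop multiword ranges and empty nodes.
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        pos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != upos:
                continue
            deprels[r[7]] += 1
            # HEAD 0 is the artificial root; "ROOT" is a label chosen for this sketch.
            parent_pos["ROOT" if r[6] == "0" else pos_by_id[r[6]]] += 1
    return deprels, parent_pos


# Invented toy sentences: a numeral modifying a noun, and a numeral as root.
SAMPLE = "\n\n".join([
    "\n".join([
        row(1, "дзве", "два", "NUM", 2, "nummod"),
        row(2, "кнігі", "кніга", "NOUN", 0, "root"),
    ]),
    row(1, "10", "10", "NUM", 0, "root"),
])

deprels, parent_pos = attachment_stats(SAMPLE, "NUM")
print(dict(deprels), dict(parent_pos))
```

On real data, the count under `ROOT` should equal the number of NUM nodes attached with the `root` relation, which is 283 in both lists above.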
3746 (64%) NUM nodes are leaves.
1447 (25%) NUM nodes have one child.
310 (5%) NUM nodes have two children.
343 (6%) NUM nodes have three or more children.
The highest child degree of a NUM node is 12.
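The four groups above partition all 5846 NUM tokens (3746 + 1447 + 310 + 343). A child-degree histogram of this kind can be sketched by counting, per sentence, how often each node ID appears in the HEAD column. The helper names `row` and `child_degree_histogram` and the sample sentences are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter


def row(idx, form, lemma, upos, head, deprel):
    """Build one 10-column CoNLL-U line (unused columns left as '_')."""
    return "\t".join([str(idx), form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])


def child_degree_histogram(conllu_text, upos):
    """Map number of children -> number of `upos` nodes with that many children."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):  # one block per sentence
        rows = [l.split("\t") for l in block.splitlines() if l and not l.startswith("#")]
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        n_children = Counter(r[6] for r in rows)  # HEAD id -> number of dependents
        for r in rows:
            if r[3] == upos:
                hist[n_children[r[0]]] += 1  # missing key counts as 0, i.e. a leaf
    return hist


# Invented toy sentences: one numeral with a single child, one leaf numeral.
SAMPLE = "\n\n".join([
    "\n".join([
        row(1, "каля", "каля", "ADP", 2, "case"),
        row(2, "дзвюх", "два", "NUM", 3, "nummod"),
        row(3, "кніг", "кніга", "NOUN", 0, "root"),
    ]),
    "\n".join([
        row(1, "тры", "тры", "NUM", 2, "nummod:gov"),
        row(2, "кнігі", "кніга", "NOUN", 0, "root"),
    ]),
])

hist = child_degree_histogram(SAMPLE, "NUM")
print(dict(hist))
print(max(hist))  # highest child degree in the sample
```

Grouping every degree of three or more into one bucket, as this page does, is a presentation choice applied after the histogram is built.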
Children of NUM nodes are attached using 29 different relations: punct (1213; 35% instances), case (413; 12% instances), list (386; 11% instances), advmod (382; 11% instances), nmod (320; 9% instances), nsubj (111; 3% instances), parataxis (105; 3% instances), conj (100; 3% instances), dep (99; 3% instances), compound (97; 3% instances), cc (44; 1% instances), amod (30; 1% instances), flat (24; 1% instances), obl (22; 1% instances), det (17; 0% instances), cop (16; 0% instances), appos (9; 0% instances), nummod (9; 0% instances), orphan (7; 0% instances), advcl (6; 0% instances), discourse (5; 0% instances), mark (5; 0% instances), nummod:gov (4; 0% instances), iobj (3; 0% instances), acl (2; 0% instances), fixed (2; 0% instances), acl:relcl (1; 0% instances), dislocated (1; 0% instances), nsubj:pass (1; 0% instances)
Children of NUM nodes belong to 16 different parts of speech: PUNCT (1213; 35% instances), NUM (540; 16% instances), ADP (389; 11% instances), ADV (333; 10% instances), NOUN (325; 9% instances), X (139; 4% instances), SYM (134; 4% instances), ADJ (92; 3% instances), PART (59; 2% instances), PROPN (57; 2% instances), CCONJ (43; 1% instances), VERB (30; 1% instances), PRON (25; 1% instances), DET (21; 1% instances), AUX (17; 0% instances), SCONJ (17; 0% instances)