Treebank Statistics: UD_Belarusian-HSE: POS Tags: NOUN
There are 9131 NOUN lemmas (31%), 18560 NOUN types (35%) and 72686 NOUN tokens (24%).
Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
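The lemma, type and token counts above can be reproduced in principle from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `count_upos` and the sample fragment are hypothetical, and it assumes that a "type" is a distinct word form compared exactly as written (no case folding).

```python
from collections import Counter

# Tiny hypothetical CoNLL-U fragment for illustration (not taken from the treebank).
SAMPLE = """\
1\tгод\tгод\tNOUN\t_\t_\t0\troot\t_\t_
2\tгода\tгод\tNOUN\t_\t_\t1\tnmod\t_\t_
3\tдобры\tдобры\tADJ\t_\t_\t4\tamod\t_\t_
4\tдзень\tдзень\tNOUN\t_\t_\t1\tconj\t_\t_
5\tгода\tгод\tNOUN\t_\t_\t4\tnmod\t_\t_
"""

def count_upos(conllu_text, upos):
    """Return (lemma counts, type counts, token count) for one UPOS tag."""
    lemmas, types = Counter(), Counter()
    tokens = 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:          # column 4 is UPOS
            tokens += 1
            types[cols[1]] += 1      # column 2 is FORM (assumed case-sensitive)
            lemmas[cols[2]] += 1     # column 3 is LEMMA
    return lemmas, types, tokens

lemmas, types, tokens = count_upos(SAMPLE, "NOUN")
print(len(lemmas), len(types), tokens)
```

Calling `lemmas.most_common(10)` or `types.most_common(10)` on the real data would give rankings comparable to the "most frequent" lists on this page.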
The 10 most frequent NOUN lemmas: год, чалавек, дзень, час, мова, беларус, гурт, сядзіба, варта, арт
The 10 most frequent NOUN types: дзень, людзей, чалавек, арт, годзе, гадоў, час, людзі, года, год
The 10 most frequent ambiguous lemmas: год (NOUN 1642, PRON 1), варта (NOUN 315, VERB 61), справа (NOUN 264, ADV 5), тысяча (NOUN 210, ADV 1), раз (NOUN 176, SCONJ 1), частка (NOUN 144, ADV 2), жанчына (NOUN 136, ADJ 1), клуб (NOUN 97, SCONJ 1), праўда (NOUN 80, ADV 1), верш (NOUN 68, ADV 2)
The 10 most frequent ambiguous types: час (NOUN 242, X 1), г. (NOUN 150, ADV 18, ADJ 1, PRON 1), варта (NOUN 58, VERB 52), свабода (NOUN 8, PROPN 1), раз (NOUN 84, SCONJ 1), варты (NOUN 51, ADJ 3), BYN (NOUN 36, X 1), справа (NOUN 30, ADV 2), ахвяраў (NOUN 32, VERB 1), дома (ADV 39, NOUN 24)
Morphology
The form / lemma ratio of NOUN is 2.032636 (the average of all parts of speech is 1.756638).
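The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. A short check, assuming that this is how the ratio is defined (the variable names are illustrative only):

```python
# Figures taken from the counts reported on this page.
noun_types = 18560
noun_lemmas = 9131

# Assumed definition: distinct word forms per distinct lemma.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```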
The 1st highest number of forms (19) was observed with the lemma “чалавек”: людей, людзi, людзей, людзмі, людзт, людзьмі, людзям, людзямі, людзях, людзі, чал., чалаве, чалавек, чалавека, чалавекам, чалавекаў, чалавеку, чалавекі, чалавеча.
The 2nd highest number of forms (18) was observed with the lemma “год”: г, г., гадамі, гадах, гадоу, гадох, гадоў, гады, гг, гг., го, год, года, годам, годдзе, годзе, году, годы.
The 3rd highest number of forms (16) was observed with the lemma “улада”: улада, уладай, уладам, уладамі, уладаў, уладзе, уладу, улады, ўлад, ўлада, ўладай, ўладамі, ўладаў, ўладзе, ўладу, ўлады.
NOUN occurs with 15 features: Number (71396; 98% instances), Case (71395; 98% instances), Gender (71386; 98% instances), Animacy (71384; 98% instances), Abbr (1073; 1% instances), Foreign (194; 0% instances), Typo (42; 0% instances), Degree (12; 0% instances), InflClass (4; 0% instances), Person (2; 0% instances), Aspect (1; 0% instances), Mood (1; 0% instances), Tense (1; 0% instances), VerbForm (1; 0% instances), Voice (1; 0% instances)
NOUN occurs with 25 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Aspect=Imp, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Degree=Pos, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, InflClass=Ind, Mood=Ind, Number=Plur, Number=Sing, Person=3, Tense=Past, Typo=Yes, VerbForm=Fin, Voice=Act
NOUN occurs with 110 feature combinations.
The most frequent feature combination is Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing (6278 tokens).
Examples: года, году, сакавіка, красавіка, лістапада, траўня, гурта, жніўня, дня, часу
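A feature combination such as the one above is the content of the CoNLL-U FEATS column: feature=value pairs joined by `|`, or `_` when there are no features. The sketch below shows one way to split such a string and tally combinations; `parse_feats` and the example list are hypothetical and only illustrate the format.

```python
from collections import Counter

def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":                 # "_" marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

top = "Animacy=Inan|Case=Gen|Gender=Masc|Number=Sing"
print(parse_feats(top))

# Tallying whole combinations: each distinct FEATS string is one combination.
# The list below is made up for illustration.
feats_column = [top, top, "Animacy=Anim|Case=Nom|Gender=Masc|Number=Plur"]
combos = Counter(feats_column)
print(combos.most_common(1))
```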
Relations
NOUN nodes are attached to their parents using 32 different relations: nmod (20088; 28% instances), obl (13490; 19% instances), nsubj (10674; 15% instances), obj (8763; 12% instances), conj (5768; 8% instances), root (5449; 7% instances), appos (1625; 2% instances), flat (1285; 2% instances), iobj (1261; 2% instances), parataxis (1081; 1% instances), nsubj:pass (781; 1% instances), xcomp (499; 1% instances), compound (399; 1% instances), list (343; 0% instances), obl:agent (243; 0% instances), fixed (207; 0% instances), nummod (146; 0% instances), orphan (132; 0% instances), vocative (120; 0% instances), ccomp (107; 0% instances), advcl (71; 0% instances), acl:relcl (41; 0% instances), acl (29; 0% instances), nummod:gov (29; 0% instances), csubj (23; 0% instances), flat:foreign (8; 0% instances), nsubj:outer (7; 0% instances), dep (5; 0% instances), dislocated (4; 0% instances), discourse (3; 0% instances), flat:name (3; 0% instances), reparandum (2; 0% instances)
Parents of NOUN nodes belong to 18 different parts of speech: VERB (33074; 46% instances), NOUN (27239; 37% instances), [no parent: root node] (5449; 7% instances), ADJ (3071; 4% instances), PROPN (1222; 2% instances), ADV (813; 1% instances), PRON (451; 1% instances), DET (328; 0% instances), NUM (325; 0% instances), X (266; 0% instances), ADP (183; 0% instances), SYM (138; 0% instances), AUX (61; 0% instances), PART (34; 0% instances), SCONJ (18; 0% instances), INTJ (9; 0% instances), CCONJ (3; 0% instances), PUNCT (2; 0% instances)
13391 (18%) NOUN nodes are leaves.
24638 (34%) NOUN nodes have one child.
20418 (28%) NOUN nodes have two children.
14239 (20%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 13.
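The child-degree figures above count, for each NOUN node, how many other nodes name it as their head. A minimal sketch of that bucketing follows, assuming one sentence given as `(id, head, upos)` tuples; the function `child_degree_buckets` and the sample sentence are hypothetical. The second part only checks that the four reported counts add up to the 72686 NOUN tokens and round to the stated percentages.

```python
from collections import Counter

def child_degree_buckets(tokens, upos):
    """tokens: list of (id, head, upos) for one sentence.
    Returns a Counter mapping 0, 1, 2 or 3 (meaning "three or more")
    to the number of nodes with that many children."""
    n_children = Counter(head for _, head, _ in tokens)
    buckets = Counter()
    for tok_id, _, tag in tokens:
        if tag == upos:
            buckets[min(n_children[tok_id], 3)] += 1
    return buckets

# Hypothetical sentence: node 1 is the root with children 2 and 4; node 4 has children 3 and 5.
sentence = [(1, 0, "NOUN"), (2, 1, "NOUN"), (3, 4, "ADJ"), (4, 1, "NOUN"), (5, 4, "NOUN")]
buckets = child_degree_buckets(sentence, "NOUN")
print(buckets)

# Consistency check on the figures reported on this page.
reported = {"0": 13391, "1": 24638, "2": 20418, "3+": 14239}
total = sum(reported.values())
shares = {k: round(100 * v / total) for k, v in reported.items()}
print(total, shares)
```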
Children of NOUN nodes are attached using 41 different relations: nmod (25592; 22% instances), case (21422; 18% instances), amod (20479; 17% instances), punct (16427; 14% instances), conj (5590; 5% instances), appos (5032; 4% instances), det (5019; 4% instances), cc (3494; 3% instances), parataxis (2218; 2% instances), advmod (1917; 2% instances), acl:relcl (1830; 2% instances), nummod:gov (1717; 1% instances), nsubj (1709; 1% instances), nummod (1411; 1% instances), acl (817; 1% instances), dep (647; 1% instances), compound (413; 0% instances), cop (400; 0% instances), list (397; 0% instances), mark (256; 0% instances), obl (209; 0% instances), iobj (203; 0% instances), expl (138; 0% instances), orphan (135; 0% instances), advcl (73; 0% instances), discourse (67; 0% instances), csubj (60; 0% instances), flat:foreign (51; 0% instances), vocative (43; 0% instances), flat (9; 0% instances), aux (8; 0% instances), flat:name (5; 0% instances), ccomp (4; 0% instances), dislocated (3; 0% instances), reparandum (3; 0% instances), aux:pass (2; 0% instances), nsubj:outer (2; 0% instances), obj (2; 0% instances), xcomp (2; 0% instances), fixed (1; 0% instances), goeswith (1; 0% instances)
Children of NOUN nodes belong to 17 different parts of speech: NOUN (27239; 23% instances), ADJ (21258; 18% instances), ADP (21191; 18% instances), PUNCT (16427; 14% instances), PROPN (8369; 7% instances), DET (5178; 4% instances), VERB (3943; 3% instances), NUM (3553; 3% instances), CCONJ (3436; 3% instances), X (2370; 2% instances), ADV (1377; 1% instances), PRON (987; 1% instances), PART (976; 1% instances), SYM (593; 1% instances), SCONJ (468; 0% instances), AUX (421; 0% instances), INTJ (22; 0% instances)