Treebank Statistics: UD_Erzya-JR: POS Tags: NOUN
There are 1325 NOUN lemmas (40%), 2922 NOUN types (43%) and 5092 NOUN tokens (25%).
Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
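Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the sample fragment is made up for illustration, the function names `pos_counts` and `rank` are hypothetical, and it assumes that types are case-sensitive word forms.

```python
from collections import defaultdict

# Toy CoNLL-U fragment, made up for illustration (not a real treebank sentence).
SAMPLE = """\
# text = toy example
1\tкудо\tкудо\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tкудов\tкудо\tNOUN\t_\t_\t0\troot\t_\t_
3\tкудо\tкудо\tNOUN\t_\t_\t2\tconj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""

def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms are compared case-sensitively
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

def rank(counts, upos, index):
    """Rank of `upos` by lemmas (index 0), types (1) or tokens (2); 1 is highest."""
    ordered = sorted(counts, key=lambda u: counts[u][index], reverse=True)
    return ordered.index(upos) + 1

counts = pos_counts(SAMPLE)
print(counts["NOUN"], rank(counts, "NOUN", 2))
```

On the toy fragment, NOUN has one lemma, two types and three tokens, and ranks first by token count.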
The 10 most frequent NOUN lemmas: ланго, кудо, веле, ломань, шка, бандит, кедь, чи, пря, вирь
The 10 most frequent NOUN types: лангс, ёнов, лангсо, бандитэсь, партизантнэ, ланга, ялгат, кедензэ, кудов, прянзо
The 10 most frequent ambiguous lemmas: тев (NOUN 55, ADV 4), ён (NOUN 55, ADJ 2, ADV 1), ведь (NOUN 24, PART 4), ашо (NOUN 14, ADJ 9), ве (NUM 18, NOUN 13, DET 10, ADV 2), ни (NOUN 13, X 2), чокшне (NOUN 11, ADV 1), экше (NOUN 8, ADJ 1), ков (ADV 21, NOUN 7), пандя (NOUN 7, VERB 1)
The 10 most frequent ambiguous types: лангс (NOUN 55, ADV 2), ланга (NOUN 19, ADV 2), тев (NOUN 13, ADV 4), ведь (NOUN 13, PART 1), пельде (NOUN 9, ADV 3, ADP 2), ладсо (NOUN 6, ADP 4), пиземе (NOUN 6, VERB 1), валт (NOUN 5, VERB 1), пелев (ADV 5, NOUN 5), потс (NOUN 5, ADV 3)
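An ambiguous type is a word form that carries more than one POS tag in the treebank. The sketch below shows one way to find such forms and order them by their NOUN count, which is the order the list above appears to follow. The function name `ambiguous_forms` is hypothetical; the counts for лангс and потс are copied from the list, while the single бандитэсь entry is a placeholder standing in for any unambiguous form.

```python
from collections import Counter, defaultdict

# (form, UPOS) observations. The лангс and потс counts come from the list above;
# the бандитэсь entry is a placeholder for an unambiguous form.
TAGGED = (
    [("лангс", "NOUN")] * 55 + [("лангс", "ADV")] * 2
    + [("потс", "NOUN")] * 5 + [("потс", "ADV")] * 3
    + [("бандитэсь", "NOUN")]
)

def ambiguous_forms(tagged, upos="NOUN"):
    """Forms seen with `upos` and at least one other tag, most frequent first."""
    by_form = defaultdict(Counter)
    for form, tag in tagged:
        by_form[form][tag] += 1
    hits = [(f, c) for f, c in by_form.items() if upos in c and len(c) > 1]
    # Sort by how often the form occurs with the tag of interest.
    return sorted(hits, key=lambda item: item[1][upos], reverse=True)

for form, tag_counts in ambiguous_forms(TAGGED):
    print(form, dict(tag_counts))
```

The same function applied to (lemma, UPOS) pairs would yield the ambiguous lemmas.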
Morphology
The form / lemma ratio of NOUN is 2.205283 (the average of all parts of speech is 2.080194).
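The reported ratio agrees with the number of NOUN types divided by the number of NOUN lemmas given at the top of the page. A quick check, with `form_lemma_ratio` as a hypothetical helper name:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# 2922 NOUN types and 1325 NOUN lemmas, as reported above.
ratio = form_lemma_ratio(2922, 1325)
print(f"{ratio:.6f}")  # 2.205283
```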
The 1st highest number of forms (24) was observed with the lemma “кудо”: Кудотнеде, кудо, кудов, кудованть, кудодо, кудозо, кудозонзо, кудонзо, кудонтень, кудонть, кудонь, кудос, кудосо, кудосонть, кудост, кудосто, кудостонть, кудось, кудоськак, кудот, кудоткак, кудотне, кудотнеяк, кудояк.
The 2nd highest number of forms (23) was observed with the lemma “веле”: Велентькак, Велесэнек, веле, велев, велева, велеванть, веледенть, велекс, велем, веленек, велентень, веленть, велень, велес, велестэ, велестэнть, велесь, велесэ, велесэнк, велесэнть, велетненень, велетнень, велетнестэ.
The 3rd highest number of forms (22) was observed with the lemma “кедь”: кедезэ, кедезэнзэ, кедензэ, кеденть, кедень, кедест, кедеть, кедте, кедтнеде, кедть, кедтькак, кедь, кедьс, кедьстэ, кедьстэнзэ, кедьсэ, кедьсэнзэ, кетьнесэ, кецтэнзэ, кецэ, кецэст, кецэтькак.
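Rankings like the three above can be obtained by grouping distinct forms under their lemma. The following sketch uses a small subset of the forms listed above rather than the full treebank, and the name `forms_by_lemma` is hypothetical.

```python
from collections import defaultdict

# (form, lemma) pairs: a small subset of the forms listed above.
PAIRS = [
    ("кудо", "кудо"), ("кудов", "кудо"), ("кудосо", "кудо"), ("кудо", "кудо"),
    ("веле", "веле"), ("велев", "веле"),
]

def forms_by_lemma(pairs):
    """Return (lemma, set of forms) tuples, the lemma with most forms first."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)  # repeated tokens of one form count once
    return sorted(forms.items(), key=lambda item: len(item[1]), reverse=True)

ranked = forms_by_lemma(PAIRS)
for lemma, forms in ranked:
    print(lemma, len(forms), sorted(forms))
```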
NOUN occurs with 21 features: Case (5054; 99% instances), Number (5032; 99% instances), Definite (4250; 83% instances), Number[psor] (800; 16% instances), Person[psor] (800; 16% instances), NounType (203; 4% instances), Clitic (112; 2% instances), Animacy (81; 2% instances), Nomzr (50; 1% instances), Number[subj] (33; 1% instances), Person[subj] (33; 1% instances), Tense (33; 1% instances), Derivation (18; 0% instances), Degree (14; 0% instances), VerbForm (14; 0% instances), Typo (9; 0% instances), Abbr (8; 0% instances), Style (6; 0% instances), AdvType (4; 0% instances), NameType (4; 0% instances), NumType (1; 0% instances)
NOUN occurs with 51 feature-value pairs: Abbr=Yes, AdvType=Loc, AdvType=Tim, Animacy=Anim, Animacy=Hum, Case=Abe, Case=Abl, Case=Cmp, Case=Com, Case=Dat, Case=Ela, Case=Gen, Case=Ill, Case=Ine, Case=Lat, Case=Loc, Case=Nom, Case=Prl, Case=Tem, Case=Tra, Clitic=Add, Definite=Def, Definite=Ind, Degree=Dim, Derivation=Omka, Derivation=Voc, Derivation=VocKaj, NameType=Geo, NameType=Sur, Nomzr=Ag, NounType=Relat, NumType=Frac, Number=Plur, Number=Plur,Sing, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Plur, Number[subj]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Person[subj]=2, Person[subj]=3, Style=Arch, Tense=Past, Tense=Pres, Typo=Yes, VerbForm=Part, VerbForm=Vnoun
NOUN occurs with 258 feature combinations.
The most frequent feature combination is Case=Nom|Definite=Ind|Number=Sing (871 tokens).
Examples: ломань, тол, тев, ведь, сельме, сёвонь, веле, атя, вирь, гудок
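The three feature statistics in this section (features, feature-value pairs and feature combinations) can all be derived from the FEATS column of the NOUN tokens. The sketch below assumes the standard CoNLL-U convention of `|`-separated `Name=Value` pairs, with `_` for an empty feature set. The input list is a toy example and `feature_stats` is a hypothetical name.

```python
from collections import Counter

# FEATS strings of four toy NOUN tokens; "_" marks a token without features.
FEATS = [
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Nom|Definite=Ind|Number=Sing",
    "Case=Ine|Definite=Ind|Number=Sing",
    "_",
]

def feature_stats(feats_column):
    """Count feature names, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1
        if feats == "_":
            continue
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            # A multi-valued pair such as Number=Plur,Sing stays one pair.
            pairs[pair] += 1
    return features, pairs, combos

features, pairs, combos = feature_stats(FEATS)
print(combos.most_common(1))
```

On the toy input the most frequent combination is Case=Nom|Definite=Ind|Number=Sing with two tokens, and Case is present on three of the four tokens.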
Relations
NOUN nodes are attached to their parents using 37 different relations: nsubj (1044; 21% instances), obl (1011; 20% instances), obj (765; 15% instances), nmod (721; 14% instances), obl:lmod (376; 7% instances), conj (239; 5% instances), compound (176; 3% instances), root (172; 3% instances), obl:inst (78; 2% instances), appos (70; 1% instances), vocative (60; 1% instances), obl:tmod (58; 1% instances), nsubj:cop (50; 1% instances), xcomp (35; 1% instances), nmod:poss (30; 1% instances), nmod:gobj (27; 1% instances), obl:cmp (23; 0% instances), fixed (22; 0% instances), advcl (19; 0% instances), orphan (19; 0% instances), amod (12; 0% instances), acl (10; 0% instances), discourse (10; 0% instances), dislocated (8; 0% instances), flat:name (8; 0% instances), nmod:gsubj (8; 0% instances), parataxis (8; 0% instances), ccomp (7; 0% instances), nmod:lmod (5; 0% instances), obl:agent (5; 0% instances), compound:nn (4; 0% instances), flat (4; 0% instances), acl:relcl (2; 0% instances), case (2; 0% instances), obl:own (2; 0% instances), csubj (1; 0% instances), nummod (1; 0% instances)
Parents of NOUN nodes belong to 12 different parts of speech: VERB (3225; 63% instances), NOUN (1254; 25% instances), ADJ (198; 4% instances), none, i.e. the node is the root (172; 3% instances), ADV (87; 2% instances), PRON (83; 2% instances), PROPN (35; 1% instances), AUX (15; 0% instances), DET (8; 0% instances), ADP (7; 0% instances), NUM (5; 0% instances), INTJ (3; 0% instances)
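The 172 parents without a part-of-speech tag match the 172 root attachments in the relation list, which is what one would expect if root nodes are counted as having a parent with no tag. The sketch below tallies relations and parent tags for one sentence under that assumption. The sentence is a toy example given as (id, UPOS, head, deprel) tuples, and `parent_stats` is a hypothetical name.

```python
from collections import Counter

# Toy dependency tree as (id, upos, head, deprel) tuples; illustrative only.
SENT = [
    (1, "NOUN", 2, "nsubj"),
    (2, "VERB", 0, "root"),
    (3, "ADJ", 4, "amod"),
    (4, "NOUN", 2, "obj"),
    (5, "PUNCT", 2, "punct"),
]

def parent_stats(sentence, upos="NOUN"):
    """Count incoming relations and parent POS tags for nodes tagged `upos`."""
    pos_of = {tok_id: tok_upos for tok_id, tok_upos, _, _ in sentence}
    relations, parent_pos = Counter(), Counter()
    for _, tok_upos, head, deprel in sentence:
        if tok_upos != upos:
            continue
        relations[deprel] += 1
        # Head 0 is the artificial root, which has no POS tag of its own.
        parent_pos[pos_of.get(head, "")] += 1
    return relations, parent_pos

relations, parent_pos = parent_stats(SENT)
print(relations, parent_pos)
```

For a full treebank, the two counters would be summed over all sentences.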
2267 (45%) NOUN nodes are leaves.
1886 (37%) NOUN nodes have one child.
599 (12%) NOUN nodes have two children.
340 (7%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 8.
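The child-degree figures above amount to a histogram of how many dependents each NOUN node has. A minimal sketch on a toy sentence, using (id, UPOS, head, deprel) tuples and a hypothetical function name `child_degree_histogram`:

```python
from collections import Counter

# Toy dependency tree as (id, upos, head, deprel) tuples; illustrative only.
SENT = [
    (1, "NOUN", 2, "nsubj"),
    (2, "VERB", 0, "root"),
    (3, "ADJ", 4, "amod"),
    (4, "NOUN", 2, "obj"),
    (5, "PUNCT", 2, "punct"),
]

def child_degree_histogram(sentence, upos="NOUN"):
    """Map number of children -> number of `upos` nodes with that many children."""
    n_children = Counter(head for _, _, head, _ in sentence)
    histogram = Counter()
    for tok_id, tok_upos, _, _ in sentence:
        if tok_upos == upos:
            # A Counter returns 0 for nodes that never occur as a head (leaves).
            histogram[n_children[tok_id]] += 1
    return histogram

hist = child_degree_histogram(SENT)
print(dict(hist), max(hist))
```

Here one NOUN is a leaf and the other has one child, so the highest child degree in the toy sentence is 1.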
Children of NOUN nodes are attached using 51 different relations: nmod (828; 19% instances), punct (816; 19% instances), amod (579; 13% instances), det (300; 7% instances), case (283; 6% instances), conj (233; 5% instances), compound (186; 4% instances), acl (135; 3% instances), nummod (130; 3% instances), nsubj (99; 2% instances), nmod:poss (96; 2% instances), cc (75; 2% instances), advmod (71; 2% instances), appos (56; 1% instances), acl:relcl (54; 1% instances), compound:nn (45; 1% instances), advcl (43; 1% instances), obl (42; 1% instances), aux:neg (34; 1% instances), cop (28; 1% instances), parataxis (24; 1% instances), discourse (21; 0% instances), nsubj:cop (19; 0% instances), mark (17; 0% instances), orphan (17; 0% instances), advmod:tmod (14; 0% instances), fixed (12; 0% instances), flat:name (11; 0% instances), advmod:lmod (10; 0% instances), obj (10; 0% instances), obl:lmod (10; 0% instances), vocative (10; 0% instances), advmod:foc (7; 0% instances), nmod:lmod (6; 0% instances), advmod:eval (5; 0% instances), flat (5; 0% instances), obl:tmod (5; 0% instances), nmod:gobj (4; 0% instances), advmod:deg (3; 0% instances), cc:preconj (3; 0% instances), xcomp (3; 0% instances), aux (2; 0% instances), aux:opt (2; 0% instances), ccomp (2; 0% instances), expl (2; 0% instances), obl:inst (2; 0% instances), aux:aspect (1; 0% instances), csubj (1; 0% instances), csubj:cop (1; 0% instances), dislocated (1; 0% instances), nmod:gsubj (1; 0% instances)
Children of NOUN nodes belong to 15 different parts of speech: NOUN (1254; 29% instances), PUNCT (816; 19% instances), ADJ (569; 13% instances), VERB (319; 7% instances), PRON (295; 7% instances), ADP (268; 6% instances), DET (188; 4% instances), PROPN (181; 4% instances), ADV (159; 4% instances), NUM (132; 3% instances), CCONJ (75; 2% instances), AUX (69; 2% instances), PART (17; 0% instances), INTJ (15; 0% instances), SCONJ (7; 0% instances)