
This page pertains to UD version 2.

Treebank Statistics: UD_Erzya-JR: POS Tags: NOUN

There are 1325 NOUN lemmas (40%), 2922 NOUN types (43%) and 5092 NOUN tokens (25%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: ланго, кудо, веле, ломань, шка, бандит, кедь, чи, пря, вирь

The 10 most frequent NOUN types: лангс, ёнов, лангсо, бандитэсь, партизантнэ, ланга, ялгат, кедензэ, кудов, прянзо

The 10 most frequent ambiguous lemmas: тев (NOUN 55, ADV 4), ён (NOUN 55, ADJ 2, ADV 1), ведь (NOUN 24, PART 4), ашо (NOUN 14, ADJ 9), ве (NUM 18, NOUN 13, DET 10, ADV 2), ни (NOUN 13, X 2), чокшне (NOUN 11, ADV 1), экше (NOUN 8, ADJ 1), ков (ADV 21, NOUN 7), пандя (NOUN 7, VERB 1)

The 10 most frequent ambiguous types: лангс (NOUN 55, ADV 2), ланга (NOUN 19, ADV 2), тев (NOUN 13, ADV 4), ведь (NOUN 13, PART 1), пельде (NOUN 9, ADV 3, ADP 2), ладсо (NOUN 6, ADP 4), пиземе (NOUN 6, VERB 1), валт (NOUN 5, VERB 1), пелев (ADV 5, NOUN 5), потс (NOUN 5, ADV 3)

Morphology

The form / lemma ratio of NOUN is 2.205283 (the average of all parts of speech is 2.080194).
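The ratio above can be reproduced from the counts given earlier on this page. This is a minimal check that assumes the ratio is the number of distinct NOUN types (2922) divided by the number of distinct NOUN lemmas (1325). The page does not spell out the formula, but the figures agree to six decimal places. The variable names are illustrative only.

```python
# Counts copied from the NOUN summary on this page.
noun_types = 2922   # distinct NOUN word forms (types)
noun_lemmas = 1325  # distinct NOUN lemmas

# Assumed definition: average number of distinct forms per lemma.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```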

The highest number of forms (24) was observed with the lemma “кудо”: Кудотнеде, кудо, кудов, кудованть, кудодо, кудозо, кудозонзо, кудонзо, кудонтень, кудонть, кудонь, кудос, кудосо, кудосонть, кудост, кудосто, кудостонть, кудось, кудоськак, кудот, кудоткак, кудотне, кудотнеяк, кудояк.

The 2nd highest number of forms (23) was observed with the lemma “веле”: Велентькак, Велесэнек, веле, велев, велева, велеванть, веледенть, велекс, велем, веленек, велентень, веленть, велень, велес, велестэ, велестэнть, велесь, велесэ, велесэнк, велесэнть, велетненень, велетнень, велетнестэ.

The 3rd highest number of forms (22) was observed with the lemma “кедь”: кедезэ, кедезэнзэ, кедензэ, кеденть, кедень, кедест, кедеть, кедте, кедтнеде, кедть, кедтькак, кедь, кедьс, кедьстэ, кедьстэнзэ, кедьсэ, кедьсэнзэ, кетьнесэ, кецтэнзэ, кецэ, кецэст, кецэтькак.

NOUN occurs with 21 features: Case (5054; 99% instances), Number (5032; 99% instances), Definite (4250; 83% instances), Number[psor] (800; 16% instances), Person[psor] (800; 16% instances), NounType (203; 4% instances), Clitic (112; 2% instances), Animacy (81; 2% instances), Nomzr (50; 1% instances), Number[subj] (33; 1% instances), Person[subj] (33; 1% instances), Tense (33; 1% instances), Derivation (18; 0% instances), Degree (14; 0% instances), VerbForm (14; 0% instances), Typo (9; 0% instances), Abbr (8; 0% instances), Style (6; 0% instances), AdvType (4; 0% instances), NameType (4; 0% instances), NumType (1; 0% instances)

NOUN occurs with 51 feature-value pairs: Abbr=Yes, AdvType=Loc, AdvType=Tim, Animacy=Anim, Animacy=Hum, Case=Abe, Case=Abl, Case=Cmp, Case=Com, Case=Dat, Case=Ela, Case=Gen, Case=Ill, Case=Ine, Case=Lat, Case=Loc, Case=Nom, Case=Prl, Case=Tem, Case=Tra, Clitic=Add, Definite=Def, Definite=Ind, Degree=Dim, Derivation=Omka, Derivation=Voc, Derivation=VocKaj, NameType=Geo, NameType=Sur, Nomzr=Ag, NounType=Relat, NumType=Frac, Number=Plur, Number=Plur,Sing, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Plur, Number[subj]=Sing, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Person[subj]=2, Person[subj]=3, Style=Arch, Tense=Past, Tense=Pres, Typo=Yes, VerbForm=Part, VerbForm=Vnoun

NOUN occurs with 258 feature combinations. The most frequent feature combination is Case=Nom|Definite=Ind|Number=Sing (871 tokens). Examples: ломань, тол, тев, ведь, сельме, сёвонь, веле, атя, вирь, гудок
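A feature combination such as the one above is a `|`-separated list of `Feature=Value` pairs, as in the FEATS column of CoNLL-U. The following sketch splits such a string into a dictionary. The helper name `parse_feats` is hypothetical and not part of any UD tool. Multi-valued entries such as `Number=Plur,Sing` are kept as a single string here.

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_":  # "_" marks an empty feature set in CoNLL-U
        return {}
    # Split on the first "=" only, so layered names such as Number[psor] stay intact.
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# The most frequent NOUN feature combination reported on this page.
most_frequent = parse_feats("Case=Nom|Definite=Ind|Number=Sing")
print(most_frequent)
```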

Relations

NOUN nodes are attached to their parents using 37 different relations: nsubj (1044; 21% instances), obl (1011; 20% instances), obj (765; 15% instances), nmod (721; 14% instances), obl:lmod (376; 7% instances), conj (239; 5% instances), compound (176; 3% instances), root (172; 3% instances), obl:inst (78; 2% instances), appos (70; 1% instances), vocative (60; 1% instances), obl:tmod (58; 1% instances), nsubj:cop (50; 1% instances), xcomp (35; 1% instances), nmod:poss (30; 1% instances), nmod:gobj (27; 1% instances), obl:cmp (23; 0% instances), fixed (22; 0% instances), advcl (19; 0% instances), orphan (19; 0% instances), amod (12; 0% instances), acl (10; 0% instances), discourse (10; 0% instances), dislocated (8; 0% instances), flat:name (8; 0% instances), nmod:gsubj (8; 0% instances), parataxis (8; 0% instances), ccomp (7; 0% instances), nmod:lmod (5; 0% instances), obl:agent (5; 0% instances), compound:nn (4; 0% instances), flat (4; 0% instances), acl:relcl (2; 0% instances), case (2; 0% instances), obl:own (2; 0% instances), csubj (1; 0% instances), nummod (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: VERB (3225; 63% instances), NOUN (1254; 25% instances), ADJ (198; 4% instances), no parent, i.e. root (172; 3% instances), ADV (87; 2% instances), PRON (83; 2% instances), PROPN (35; 1% instances), AUX (15; 0% instances), DET (8; 0% instances), ADP (7; 0% instances), NUM (5; 0% instances), INTJ (3; 0% instances)

2267 (45%) NOUN nodes are leaves.

1886 (37%) NOUN nodes have one child.

599 (12%) NOUN nodes have two children.

340 (7%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 8.
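Child-degree figures like those above could be derived from a CoNLL-U file roughly as follows. This is a sketch, not the script that generated this page: the function name `noun_child_degrees` is hypothetical, and the two-line English sample is invented for illustration and is not taken from UD_Erzya-JR.

```python
from collections import Counter


def noun_child_degrees(conllu_text):
    """Return a Counter mapping child count -> number of NOUN nodes."""
    degrees = Counter()
    sentence = []  # (id, upos, head) triples of the current sentence

    def flush():
        # Count how many nodes name each ID as their head.
        children = Counter(head for _, _, head in sentence)
        for node_id, upos, _ in sentence:
            if upos == "NOUN":
                degrees[children[node_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:            # a blank line ends a sentence
            flush()
            continue
        if line.startswith("#"):  # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
            continue
        sentence.append((cols[0], cols[3], cols[6]))  # ID, UPOS, HEAD
    flush()  # handle a final sentence without a trailing blank line
    return degrees


# Invented toy sentence: "the dog sees a big cat".
rows = [
    "1 the the DET _ _ 2 det _ _",
    "2 dog dog NOUN _ _ 3 nsubj _ _",
    "3 sees see VERB _ _ 0 root _ _",
    "4 a a DET _ _ 6 det _ _",
    "5 big big ADJ _ _ 6 amod _ _",
    "6 cat cat NOUN _ _ 3 obj _ _",
]
sample = "\n".join("\t".join(r.split()) for r in rows)
degrees = noun_child_degrees(sample)
print(sorted(degrees.items()))
```

In the toy sentence, "dog" has one child and "cat" has two, so no NOUN is a leaf. Applied to the whole treebank, the count at key 0 would correspond to the number of leaf NOUN nodes, and the largest key to the highest child degree.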

Children of NOUN nodes are attached using 51 different relations: nmod (828; 19% instances), punct (816; 19% instances), amod (579; 13% instances), det (300; 7% instances), case (283; 6% instances), conj (233; 5% instances), compound (186; 4% instances), acl (135; 3% instances), nummod (130; 3% instances), nsubj (99; 2% instances), nmod:poss (96; 2% instances), cc (75; 2% instances), advmod (71; 2% instances), appos (56; 1% instances), acl:relcl (54; 1% instances), compound:nn (45; 1% instances), advcl (43; 1% instances), obl (42; 1% instances), aux:neg (34; 1% instances), cop (28; 1% instances), parataxis (24; 1% instances), discourse (21; 0% instances), nsubj:cop (19; 0% instances), mark (17; 0% instances), orphan (17; 0% instances), advmod:tmod (14; 0% instances), fixed (12; 0% instances), flat:name (11; 0% instances), advmod:lmod (10; 0% instances), obj (10; 0% instances), obl:lmod (10; 0% instances), vocative (10; 0% instances), advmod:foc (7; 0% instances), nmod:lmod (6; 0% instances), advmod:eval (5; 0% instances), flat (5; 0% instances), obl:tmod (5; 0% instances), nmod:gobj (4; 0% instances), advmod:deg (3; 0% instances), cc:preconj (3; 0% instances), xcomp (3; 0% instances), aux (2; 0% instances), aux:opt (2; 0% instances), ccomp (2; 0% instances), expl (2; 0% instances), obl:inst (2; 0% instances), aux:aspect (1; 0% instances), csubj (1; 0% instances), csubj:cop (1; 0% instances), dislocated (1; 0% instances), nmod:gsubj (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (1254; 29% instances), PUNCT (816; 19% instances), ADJ (569; 13% instances), VERB (319; 7% instances), PRON (295; 7% instances), ADP (268; 6% instances), DET (188; 4% instances), PROPN (181; 4% instances), ADV (159; 4% instances), NUM (132; 3% instances), CCONJ (75; 2% instances), AUX (69; 2% instances), PART (17; 0% instances), INTJ (15; 0% instances), SCONJ (7; 0% instances)