
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-RNC: POS Tags: PROPN

There are 3200 PROPN lemmas (25%), 5321 PROPN types (17%) and 12213 PROPN tokens (7%). Out of 17 observed tags, the rank of PROPN is: 2 in number of lemmas, 4 in number of types and 7 in number of tokens.

The 10 most frequent PROPN lemmas: Иванъ, Москва, Ивановичь, Русия, Василей, Борисъ, Васильевичь, Петръ, Скобѣевъ, Фролъ

The 10 most frequent PROPN types: Русии, Москвѣ, Ивана, Москве, Иван, Ивановичю, Скобеев, Борису, Фрол, Ивановича

The most frequent ambiguous lemmas (only 7 are listed): _ (X 353, PROPN 11, NOUN 9, ADJ 1, CCONJ 1, NUM 1), гора (NOUN 54, PROPN 1), ветхий (ADJ 8, PROPN 1), А (X 6, PROPN 1), С (X 4, PROPN 1), мырза (NOUN 1, PROPN 1), уланъ (NOUN 2, PROPN 1)

The 10 most frequent ambiguous types: Руси (PROPN 36, NOUN 1), Иванов (PROPN 20, ADJ 1), Иванова (PROPN 18, ADJ 11), Федорова (PROPN 15, ADJ 2), Борисова (PROPN 11, ADJ 2), Савел(ь)ев (PROPN 11, ADJ 1), Васильев (PROPN 10, ADJ 1), Ивановъ (PROPN 9, ADJ 1), Семенова (PROPN 8, ADJ 2), поповъ (ADJ 1, PROPN 1)

Morphology

The form / lemma ratio of PROPN is 1.662813 (the average of all parts of speech is 2.481645).
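The reported ratio appears consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page. The following is a minimal sketch under that assumption; the function name `form_lemma_ratio` is hypothetical and is not part of any UD tool.

```python
# Counts copied from the PROPN summary on this page.
propn_types = 5321   # distinct word forms tagged PROPN
propn_lemmas = 3200  # distinct lemmas tagged PROPN


def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct forms per lemma (assumed definition)."""
    return n_types / n_lemmas


ratio = form_lemma_ratio(propn_types, propn_lemmas)
print(f"PROPN form / lemma ratio: {ratio:.6f}")
```

A value below the all-POS average of 2.481645 suggests that a typical proper noun is attested in fewer distinct forms than a typical lemma in the treebank.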

The 1st highest number of forms (24) was observed with the lemma “Ивановичь”: Іванович, Івановича, Івановичь, Івановичю, Івановичꙋ, Іоанновичь, Ив[ановичю], Иванови[ч[а]], Иванови[чу], Иванович, Иванович[а], Иванович[е], Иванович[ь], Иванович[ѣ], Ивановича, Ивановиче, Ивановичем, Ивановичемъ, Ивановичи, Ивановичъ, Ивановичь, Ивановичю, Ивановичя, Ивановичѣ.

The 2nd highest number of forms (19) was observed with the lemma “Васильевичь”: ВАСИЛЬЕВИЧА, Васил[ь]евич, Васил[ь]евича, Васил[ь]евичем, Васил[ь]евичю, Васил[ь]євич[ь], Васильевич, Васильевича, Васильевиче, Васильевичем, Васильевичемъ, Васильевичи, Васильевичу, Васильевичъ, Васильевичь, Васильевичю, Васильевичя, Васильявича, Васильєвич[ь].

The 3rd highest number of forms (18) was observed with the lemma “Дмитриевичь”: Дмитреевич, Дмитреевич[а], Дмитреевич[е]мъ, Дмитреевич[ем], Дмитреевич[ю], Дмитреевича, Дмитреевичу, Дмитреевичъ, Дмитреевичю, Дмитреєвич[ь], Дмитриевич, Дмитриевич[ю], Дмитриевичь, Дмитриевичю, Дмитриевичя, Дмитріевича, Дмитріевичь, Дмитріевичѣ.

PROPN occurs with 8 features: NameType (12207; 100% instances), Case (12204; 100% instances), Gender (12204; 100% instances), Number (12204; 100% instances), Animacy (708; 6% instances), Abbr (6; 0% instances), InflClass (6; 0% instances), Typo (3; 0% instances)

PROPN occurs with 25 feature-value pairs: Abbr=Yes, Animacy=Anim, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Masc, Gender=Neut, InflClass=Ind, NameType=Geo, NameType=Giv, NameType=Oth, NameType=Pat, NameType=Pro, NameType=Prs, NameType=Sur, Number=Count, Number=Dual, Number=Plur, Number=Sing, Typo=Yes

PROPN occurs with 118 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|NameType=Giv|Number=Sing (2319 tokens). Examples: Иван, Фрол, Иванъ, Борис, Петр, Ивашко, Оска, Михайло, Семен, Аѳонка
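A feature combination such as the one above follows the CoNLL-U FEATS convention: `Name=Value` pairs joined by `|`, with `_` standing for an empty feature set. The sketch below splits such a string into a dictionary; `parse_feats` is a hypothetical helper written for illustration, not a function from a UD library.

```python
def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string into a {name: value} dictionary."""
    if feats == "_":  # "_" marks a token with no features
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


combo = parse_feats("Case=Nom|Gender=Masc|NameType=Giv|Number=Sing")
print(combo["NameType"])  # the name-type value of the combination
```

Counting tokens per parsed combination, for example with `collections.Counter` over the original FEATS strings, would be one way to reproduce a ranking like the one reported here.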

Relations

PROPN nodes are attached to their parents using 25 different relations: appos (3533; 29% instances), flat:name (3394; 28% instances), obl (1718; 14% instances), conj (1191; 10% instances), nmod (843; 7% instances), nsubj (729; 6% instances), obj (169; 1% instances), root (139; 1% instances), iobj (135; 1% instances), orphan (100; 1% instances), compound (74; 1% instances), xcomp (41; 0% instances), nsubj:pass (36; 0% instances), parataxis (25; 0% instances), acl:relcl (22; 0% instances), vocative (22; 0% instances), amod (9; 0% instances), obl:agent (9; 0% instances), advcl (8; 0% instances), ccomp (7; 0% instances), obl:tmod (3; 0% instances), csubj (2; 0% instances), list (2; 0% instances), dep (1; 0% instances), dislocated (1; 0% instances)

Parents of PROPN nodes belong to 12 different parts of speech, counting the untagged artificial root node as one: PROPN (4425; 36% instances), NOUN (4232; 35% instances), VERB (2810; 23% instances), PRON (362; 3% instances), ADJ (197; 2% instances), no tag, i.e. the PROPN node is attached as root (139; 1% instances), ADV (22; 0% instances), DET (10; 0% instances), AUX (6; 0% instances), PART (6; 0% instances), NUM (3; 0% instances), X (1; 0% instances)

5290 (43%) PROPN nodes are leaves.

3938 (32%) PROPN nodes have one child.

1922 (16%) PROPN nodes have two children.

1063 (9%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 45.
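The child-degree figures above (leaves, one child, two children, three or more) can be derived from the HEAD column of a CoNLL-U file. The sketch below shows the idea on a hypothetical toy sentence given as `(id, head, upos)` triples; the rows are illustrative and not taken from the treebank, and `propn_child_degrees` is an assumed helper name.

```python
from collections import Counter

# Hypothetical toy sentence as (id, head, upos) triples; head 0 is the root.
tokens = [
    (1, 3, "ADP"),
    (2, 3, "ADJ"),
    (3, 0, "PROPN"),
    (4, 3, "PROPN"),
    (5, 4, "PROPN"),
]


def propn_child_degrees(rows):
    """Bucket PROPN nodes by number of children: 0, 1, 2, or 3 (= three or more)."""
    children = Counter(head for _, head, _ in rows)  # children per head id
    buckets = Counter()
    for tok_id, _, upos in rows:
        if upos == "PROPN":
            buckets[min(children[tok_id], 3)] += 1
    return buckets


buckets = propn_child_degrees(tokens)
for degree in range(4):
    print(degree, buckets[degree])
```

On real data, token ids are only unique within a sentence, so the counting would need to be done per sentence and the buckets summed.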

Children of PROPN nodes are attached using 33 different relations: flat:name (3441; 29% instances), case (2774; 23% instances), punct (1350; 11% instances), conj (1231; 10% instances), cc (810; 7% instances), appos (573; 5% instances), nmod (503; 4% instances), amod (432; 4% instances), det (359; 3% instances), nsubj (108; 1% instances), advmod (95; 1% instances), orphan (93; 1% instances), acl:relcl (35; 0% instances), cop (30; 0% instances), acl (26; 0% instances), dep (23; 0% instances), discourse (22; 0% instances), obl (20; 0% instances), list (18; 0% instances), parataxis (18; 0% instances), compound (14; 0% instances), mark (13; 0% instances), vocative (12; 0% instances), obl:tmod (7; 0% instances), iobj (6; 0% instances), advcl (3; 0% instances), nummod:gov (3; 0% instances), obj (3; 0% instances), nummod (2; 0% instances), aux (1; 0% instances), flat (1; 0% instances), nsubj:pass (1; 0% instances), obl:float (1; 0% instances)

Children of PROPN nodes belong to 16 different parts of speech: PROPN (4425; 37% instances), ADP (2767; 23% instances), NOUN (1506; 13% instances), PUNCT (1350; 11% instances), CCONJ (804; 7% instances), ADJ (443; 4% instances), DET (362; 3% instances), VERB (97; 1% instances), PART (93; 1% instances), PRON (59; 0% instances), ADV (41; 0% instances), AUX (31; 0% instances), X (22; 0% instances), SCONJ (19; 0% instances), NUM (8; 0% instances), INTJ (1; 0% instances)