
This page pertains to UD version 2.

Treebank Statistics: UD_Old_East_Slavic-TOROT: POS Tags: NOUN

There are 4678 NOUN lemmas (34%), 17351 NOUN types (32%) and 53967 NOUN tokens (22%). Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 2 in number of types and 1 in number of tokens.
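Counts of this kind can be reproduced from a treebank in CoNLL-U format. The following is a minimal sketch, not the script that generated this page. It assumes that "types" means distinct word forms compared case-sensitively, and it runs on a toy `SAMPLE` whose rows are hypothetical and only borrow a few forms from this page.

```python
from collections import defaultdict

# Toy rows (hypothetical, not real treebank sentences); an empty row separates sentences.
ROWS = [
    "1 въ въ ADP _ _ 2 case _ _",
    "2 землю земля NOUN _ Case=Acc|Gender=Fem|Number=Sing 3 obl _ _",
    "3 иде ити VERB _ _ 0 root _ _",
    "4 кнѧзь кънязь NOUN _ Case=Nom|Gender=Masc|Number=Sing 3 nsubj _ _",
    "",
    "1 кнѧзь кънязь NOUN _ Case=Nom|Gender=Masc|Number=Sing 0 root _ _",
    "2 земли земля NOUN _ Case=Gen|Gender=Fem|Number=Sing 1 nmod _ _",
]
# CoNLL-U columns are tab-separated, so rebuild each row with tabs.
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep basic tokens only: skip multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


counts = upos_counts(SAMPLE)
print(counts["NOUN"])
```

Ranking the tags by each of the three numbers in `counts` would give the rank figures quoted above.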

The 10 most frequent NOUN lemmas: богъ, лѣто, кънязь, дьнь, земля, отьць, цьркы, градъ, людие, сынъ

The 10 most frequent NOUN types: лѣт, лѣт҃, дн҃ь, б҃ъ, землю, дн҃и, земли, б҃а, кнѧзь, бг҃ъ

The 10 most frequent ambiguous lemmas: человѣкъ (NOUN 356, PROPN 1), епископъ (NOUN 156, PROPN 1), грьчинъ (NOUN 135, PROPN 1), вѣсть (NOUN 70, ADV 1), вечеръ (NOUN 69, ADV 2), рѣчь (NOUN 62, ADV 4), добро (NOUN 55, ADV 7), словѣнинъ (NOUN 46, PROPN 1), печера (NOUN 37, PROPN 4), другъ (PRON 64, NOUN 34)

The 10 most frequent ambiguous types: дн҃ь (NOUN 356, ADV 1), б҃ъ (NOUN 333, PROPN 1), имѧ (NOUN 95, PRON 1, VERB 1), города (NOUN 88, PROPN 1), г҃и (NOUN 84, NUM 1), дѣти (NOUN 78, VERB 2), градѣ (NOUN 69, PROPN 1), б҃у (NOUN 67, PROPN 1), зла (NOUN 61, ADJ 18), дѣла (NOUN 60, VERB 1)
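Both ambiguity lists appear to be ordered by the NOUN count of each item. A minimal sketch of that selection follows, assuming an item counts as ambiguous when it occurs with NOUN and at least one other tag. The function name `ambiguous_items` and the toy `PAIRS` are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos="NOUN", n=10):
    """pairs: iterable of (item, tag), where item is a word form or a lemma."""
    tags = defaultdict(Counter)
    for item, tag in pairs:
        tags[item][tag] += 1
    # Keep items seen with the target tag and with at least one other tag.
    ambiguous = [(item, c) for item, c in tags.items() if len(c) > 1 and c[upos] > 0]
    # Order by how often the item carries the target tag, most frequent first.
    ambiguous.sort(key=lambda ic: -ic[1][upos])
    return [(item, c.most_common()) for item, c in ambiguous[:n]]


# Toy counts (hypothetical), reusing a few forms from the list above.
PAIRS = (
    [("зла", "NOUN")] * 3 + [("зла", "ADJ")]
    + [("дѣти", "NOUN")] * 2 + [("дѣти", "VERB")]
    + [("землю", "NOUN")] * 5
)
result = ambiguous_items(PAIRS)
print(result)
```

Passing (lemma, tag) pairs instead of (form, tag) pairs yields the lemma-level list.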

Morphology

The form / lemma ratio of NOUN is 3.709064 (the average of all parts of speech is 3.947827).
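The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The page does not state the formula, so this is an inference from the figures; the check below is a small sketch of it.

```python
noun_types = 17351   # distinct NOUN word forms reported on this page
noun_lemmas = 4678   # distinct NOUN lemmas reported on this page

ratio = noun_types / noun_lemmas
formatted = f"{ratio:.6f}"
print(formatted)  # prints 3.709064
```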

The 1st highest number of forms (107) was observed with the lemma “кънязь”: к[н҃зь, кнзе, кнзеи, кнзем, кнземъ, кнзи, кнзмъ, кнзь, кнзьми, кнзю, кнзя, кнз҃емь, кнз҃и, кнз҃ии, кнз҃ии:, кнз҃мь, кнз҃ь, кнз҃ю, кнз҃ѧ, кнз’, княже, княз, князе, князеи, князей, князем, княземъ, князи, князь, князьми, князю, князя, кнѕи, кнѕѧ, кнѕ҃е, кнѧ, кнѧже, кнѧж҃, кнѧз, кнѧзем, кнѧзема, кнѧземъ, кнѧземь, кнѧзем҃, кнѧзех, кнѧзехъ, кнѧзи, кнѧзии, кнѧзихъ, кнѧзъ, кнѧзь, кнѧзьмь, кнѧзю, кнѧзѣ, кнѧзѧ, кнѧзꙗ, кнѧѕех, кнѧѕи, кнѧѕь, кнѧѕѧ, кнѧ҃зѧ, кн҃же, кн҃зеи, кн҃зема, кн҃земъ, кн҃земь, кн҃зи, кн҃зихъ, кн҃змь, кн҃зь, кн҃зьма, кн҃зьмь, кн҃зю, кн҃зя, кн҃зѣ, кн҃зѣмь, кн҃зѣхъ, кн҃зѧ, кн҃з҃ѧ, кн҃ѕе, кн҃ѕем, кн҃ѕемъ, кн҃ѕехъ, кн҃ѕи, кн҃ѕь, кн҃ѕю, кн҃ѕѧ, кн҃ѧзеи, кн҃ѧзь, кн҃ѧзю, кн҃ѧзѧ, кнꙗзи, кнꙗзь, кнꙗзьмь, кнꙗзю, кнꙗзѧ, кнꙗзꙗ, къ[н]ѧз[ь, кънѧже, кънѧзоу, кънѧзь, кънѧзьхъ, кънѧзю, кънѧзѧ, кънѧзѹ, кънꙗземъ, к҃нзь.

The 2nd highest number of forms (88) was observed with the lemma “отьць”: о[ть]чеви, отец, отецъ, отець, отца, отцем, отцов, отцы, отьца, отьць, отьцю, отьче, оца҃, оцв҃и, оць҃, оцѧ҃, оц҃а, оц҃емъ, оц҃емь, оц҃и, оц҃мь, оц҃ь, оц҃ьмь, оц҃ю, оц҃ѧ, оц҃ѹ, оч҃е, оч҃и, оч҃ь, оч҃ѧ, о҃ца, о҃цъ, о҃че, ѡтецъ, ѡтець, ѡтцѧ, ѡтц҃а, ѡтц҃ъ, ѡтч҃е, ѡтьци, ѡци, ѡцтю, ѡці, ѡц҃а, ѡц҃емъ, ѡц҃емь, ѡц҃и, ѡц҃ихъ, ѡц҃мь, ѡц҃у, ѡц҃ь, ѡц҃ю, ѡц҃ѧ, ѡче, ѡчи, ѡч҃е, ѡ҃ца, ѡ҃цемъ, ѡ҃ци, ѡ҃ць, ѡ҃цю, ѡ҃ц҃ь, Ѿцо҃у, ѿца, ѿцем, ѿцемъ, ѿцу, ѿцъ, ѿцы, ѿць, ѿц҃а, ѿц҃евъ, ѿц҃ем, ѿц҃емъ, ѿц҃емь, ѿц҃и, ѿц҃овъ, ѿц҃оу, ѿц҃у, ѿц҃ъ, ѿц҃ы, ѿц҃ь, ѿц҃ѣхъ, ѿц҃ѹ, ѿц҃ꙋ, ѿче, ѿч҃, ѿч҃е.

The 3rd highest number of forms (86) was observed with the lemma “богъ”: «Богь, Бга҃, Бго҃у, Бгѡ҃мъ, Бг҃оу, Бг҃ѡмъ, Богом, Богь, Боу҃, ба, ба҃, бв҃и, бгм҃ь, бго҃мъ, бгу, бгъ, бгъ҃, бг҃а, бг҃ви, бг҃мь, бг҃ови, бг҃ом, бг҃омъ, бг҃у, бг҃ъ, бг҃ь, бг҃ѹ, бг҃҃а, бг҃ꙋ, бе, бж҃е, бз҃и, бз҃ѣ, бм҃ъ, бм҃ь, бог, бога, богмь, бого, богови, богу, богъ, богѹ, боже, бози, бозѣ, боѕѣ, бо҃мь, бу, бу҃, бъ, бъгъмь, бъ҃, бъ҃мь, бь҃, бѕѣ, бѕ҃ѣ, бѹ, бѹ҃, б҃, б҃а, б҃ви, б҃га, б҃гом, б҃гомъ, б҃гу, б҃гъ, б҃гѹ, б҃гꙋ, б҃е, б҃зи, б҃зѣ, б҃мъ, б҃мь, б҃о, б҃овъ, б҃омь, б҃оу, б҃у, б҃ъ, б҃ъмъ, б҃ы, б҃ь, б҃ѹ, б҃ꙋ, б꙽зѣ.
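Rankings like the three above can be derived by collecting the distinct forms attested for each lemma. This is a minimal sketch under the assumption that forms are compared as raw strings, so abbreviations with titlo, bracketed restorations and spelling variants each count separately, as they do in the lists above. The helper names and the toy `PAIRS` are hypothetical.

```python
from collections import defaultdict


def forms_per_lemma(pairs):
    """Map each lemma to the sorted list of distinct forms seen with it."""
    forms = defaultdict(set)
    for form, lemma in pairs:
        forms[lemma].add(form)
    return {lemma: sorted(fs) for lemma, fs in forms.items()}


def top_lemmas(pairs, n=3):
    """Return the n lemmas with the most distinct forms, ties broken by lemma."""
    table = forms_per_lemma(pairs)
    return sorted(table.items(), key=lambda kv: (-len(kv[1]), kv[0]))[:n]


# Toy (form, lemma) pairs using forms listed above; the frequencies are made up.
PAIRS = [
    ("князь", "кънязь"), ("кнѧзь", "кънязь"), ("кнѧзю", "кънязь"),
    ("кнѧзь", "кънязь"),
    ("отьць", "отьць"), ("отца", "отьць"),
    ("богъ", "богъ"),
]
ranking = [(lemma, len(forms)) for lemma, forms in top_lemmas(PAIRS)]
print(ranking)
```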

NOUN occurs with 3 features: Case (53587; 99% instances), Gender (53587; 99% instances), Number (53587; 99% instances)

NOUN occurs with 15 feature-value pairs: Case=Acc, Case=Dat, Case=Dat,Gen, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Gender=Fem, Gender=Fem,Masc, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 70 feature combinations. The most frequent feature combination is Case=Gen|Gender=Masc|Number=Sing (4904 tokens). Examples: б҃а, сн҃а, бг҃а, града, брата, кнѧзѧ, мс҃цѧ, города, мсца, мира
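The three feature summaries above can be computed from the FEATS column of the NOUN tokens. The sketch below assumes that a multi-valued entry such as Case=Dat,Gen is kept as a single feature-value pair, which matches how it is listed on this page. The function name `feature_stats` and the toy input are hypothetical.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count features, feature-value pairs and full combinations in FEATS strings."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        if feats == "_":          # token without morphological features
            continue
        combos[feats] += 1
        for item in feats.split("|"):
            name, _value = item.split("=", 1)
            features[name] += 1
            pairs[item] += 1      # "Case=Dat,Gen" stays one pair
    return features, pairs, combos


# Toy FEATS values (hypothetical frequencies).
FEATS = [
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Gen|Gender=Masc|Number=Sing",
    "Case=Acc|Gender=Fem|Number=Sing",
    "Case=Dat,Gen|Gender=Masc|Number=Sing",
    "_",
]
features, pairs, combos = feature_stats(FEATS)
print(len(features), len(pairs), len(combos))
print(combos.most_common(1))
```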

Relations

NOUN nodes are attached to their parents using 23 different relations: obl (15186; 28% instances), obj (8778; 16% instances), nsubj (7829; 15% instances), conj (6663; 12% instances), nmod (5015; 9% instances), appos (2776; 5% instances), obl:arg (2098; 4% instances), root (1698; 3% instances), vocative (1077; 2% instances), orphan (780; 1% instances), xcomp (383; 1% instances), nsubj:pass (321; 1% instances), advcl:cmp (308; 1% instances), obl:agent (287; 1% instances), acl (215; 0% instances), advcl (199; 0% instances), dislocated (157; 0% instances), dep (82; 0% instances), ccomp (60; 0% instances), parataxis (28; 0% instances), fixed (18; 0% instances), csubj (5; 0% instances), nsubj:outer (4; 0% instances)

Parents of NOUN nodes fall into 15 categories (14 parts of speech, plus the case where the NOUN is the root and has no parent): VERB (33725; 62% instances), NOUN (11530; 21% instances), no parent (1698; 3% instances), ADJ (1575; 3% instances), PROPN (1565; 3% instances), NUM (1352; 3% instances), AUX (1076; 2% instances), PRON (784; 1% instances), ADV (497; 1% instances), INTJ (49; 0% instances), CCONJ (46; 0% instances), SCONJ (36; 0% instances), ADP (26; 0% instances), DET (6; 0% instances), X (2; 0% instances)
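Both distributions above come from the HEAD and DEPREL columns: the relation is read off the NOUN token itself, and the parent's part of speech is looked up through its head ID. The following is a minimal sketch on hypothetical toy rows. It assumes every HEAD value is an integer and labels root-attached nouns `"(root)"`, a name chosen here for illustration.

```python
from collections import Counter

# Toy rows (hypothetical, not real treebank sentences); an empty row separates sentences.
ROWS = [
    "1 въ въ ADP _ _ 2 case _ _",
    "2 землю земля NOUN _ Case=Acc|Gender=Fem|Number=Sing 3 obl _ _",
    "3 иде ити VERB _ _ 0 root _ _",
    "4 кнѧзь кънязь NOUN _ Case=Nom|Gender=Masc|Number=Sing 3 nsubj _ _",
    "",
    "1 кнѧзь кънязь NOUN _ Case=Nom|Gender=Masc|Number=Sing 0 root _ _",
    "2 земли земля NOUN _ Case=Gen|Gender=Fem|Number=Sing 1 nmod _ _",
]
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS)


def parse_sentences(conllu_text):
    """Yield each sentence as {id: (upos, head, deprel)}, basic tokens only."""
    sent = {}
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():          # blank line closes the current sentence
            if sent:
                yield sent
                sent = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sent[int(cols[0])] = (cols[3], int(cols[6]), cols[7])


def attachment_stats(conllu_text, upos="NOUN"):
    """Count incoming relations and parent tags for tokens with the given UPOS."""
    relations, parent_pos = Counter(), Counter()
    for sent in parse_sentences(conllu_text):
        for tag, head, deprel in sent.values():
            if tag != upos:
                continue
            relations[deprel] += 1
            # HEAD 0 means the token is the root and has no parent.
            parent_pos[sent[head][0] if head else "(root)"] += 1
    return relations, parent_pos


relations, parent_pos = attachment_stats(SAMPLE)
print(relations.most_common())
print(parent_pos.most_common())
```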

13587 (25%) NOUN nodes are leaves.

21495 (40%) NOUN nodes have one child.

13034 (24%) NOUN nodes have two children.

5851 (11%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 52.
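The leaf, one-child, two-children and three-or-more figures above partition all NOUN nodes (13587 + 21495 + 13034 + 5851 = 53967). A minimal sketch of how such a breakdown could be computed follows. It takes sentences already reduced to (id, upos, head) triples; the function name and the toy input are hypothetical.

```python
from collections import Counter


def child_degree_stats(sentences, upos="NOUN"):
    """sentences: list of sentences, each a list of (id, upos, head) triples."""
    buckets, highest = Counter(), 0
    for sent in sentences:
        # Number of children per node = how often its ID occurs as a HEAD.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag != upos:
                continue
            degree = n_children[tok_id]
            highest = max(highest, degree)
            buckets[min(degree, 3)] += 1   # bucket 3 means "three or more"
    return buckets, highest


# Toy trees (hypothetical).
SENTENCES = [
    [(1, "ADP", 2), (2, "NOUN", 3), (3, "VERB", 0), (4, "NOUN", 3)],
    [(1, "NOUN", 0), (2, "NOUN", 1)],
    [(1, "ADP", 4), (2, "DET", 4), (3, "ADJ", 4), (4, "NOUN", 0)],
]
buckets, highest = child_degree_stats(SENTENCES)
print(dict(buckets), highest)
```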

Children of NOUN nodes are attached using 30 different relations: case (17690; 26% instances), amod (11168; 16% instances), det (9815; 14% instances), cc (6662; 10% instances), conj (6451; 9% instances), nmod (4533; 7% instances), appos (3567; 5% instances), acl (1783; 3% instances), nummod (1627; 2% instances), advmod (1140; 2% instances), orphan (1054; 2% instances), nsubj (929; 1% instances), cop (901; 1% instances), discourse (497; 1% instances), mark (434; 1% instances), obl (234; 0% instances), advcl (122; 0% instances), dislocated (103; 0% instances), ccomp (69; 0% instances), aux (48; 0% instances), vocative (48; 0% instances), fixed (20; 0% instances), parataxis (16; 0% instances), advcl:cmp (9; 0% instances), obj (8; 0% instances), obl:agent (8; 0% instances), csubj (6; 0% instances), obl:arg (4; 0% instances), nsubj:outer (1; 0% instances), xcomp (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: ADP (17713; 26% instances), ADJ (11943; 17% instances), NOUN (11530; 17% instances), DET (7996; 12% instances), CCONJ (6669; 10% instances), PRON (3035; 4% instances), PROPN (2925; 4% instances), ADV (2313; 3% instances), VERB (1877; 3% instances), NUM (1763; 3% instances), AUX (1011; 1% instances), SCONJ (113; 0% instances), INTJ (54; 0% instances), X (6; 0% instances)