
This page pertains to UD version 2.

Treebank Statistics: UD_Russian-Taiga: POS Tags: DET

There are 57 DET lemmas (0%), 328 DET types (1%) and 5698 DET tokens (3%). Out of 17 observed tags, the rank of DET is: 14 in number of lemmas, 8 in number of types and 10 in number of tokens.
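As a rough illustration of how lemma, type and token counts like these could be derived from a CoNLL-U file, here is a minimal sketch. The sample sentence and the function name `upos_counts` are hypothetical and not taken from UD_Russian-Taiga. The sketch also assumes that types are distinct word forms compared case-sensitively, which this page does not state.

```python
# Hypothetical one-sentence CoNLL-U fragment ("Все эти дома и этот дом"),
# built from tuples so the tab-separated columns stay visible.
ROWS = [
    ("1", "Все", "весь", "DET", "_", "_", "3", "det", "_", "_"),
    ("2", "эти", "этот", "DET", "_", "_", "3", "det", "_", "_"),
    ("3", "дома", "дом", "NOUN", "_", "_", "0", "root", "_", "_"),
    ("4", "и", "и", "CCONJ", "_", "_", "6", "cc", "_", "_"),
    ("5", "этот", "этот", "DET", "_", "_", "6", "det", "_", "_"),
    ("6", "дом", "дом", "NOUN", "_", "_", "3", "conj", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)


def upos_counts(conllu_text, upos):
    """Return (number of lemmas, number of types, number of tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        # Keep only regular word lines; multiword ranges ("1-2") and
        # empty nodes ("1.1") have non-numeric IDs and are skipped.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            lemmas.add(cols[2])
            types.add(cols[1])  # assumption: forms are compared case-sensitively
            tokens += 1
    return len(lemmas), len(types), tokens


det_lemmas, det_types, det_tokens = upos_counts(SAMPLE, "DET")
print(det_lemmas, det_types, det_tokens)
```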

The 10 most frequent DET lemmas: этот, весь, такой, свой, мой, тот, другой, какой, один, сам

The 10 most frequent DET types: все, этот, его, такой, мой, этой, сам, эти, их, всех

The 10 most frequent ambiguous lemmas: этот (DET 830, PRON 7), весь (DET 624, PRON 7), тот (DET 293, PRON 5), один (DET 236, NUM 194), его (DET 169, X 1), многий (DET 45, ADJ 3), сей (DET 40, PRON 1), все (PRON 213, PART 10, DET 2), который (PRON 394, DET 2), это (PRON 961, PART 138, DET 2)

The 10 most frequent ambiguous types: все (PRON 381, DET 216, ADV 44, PART 12), его (PRON 205, DET 161, X 1), мой (DET 79, VERB 1), их (PRON 126, DET 77, X 2), всех (DET 84, PRON 44), это (PRON 579, PART 133, DET 81), этом (DET 84, PRON 75), один (NUM 64, DET 57), этого (PRON 110, DET 71), тот (DET 50, PRON 2)

Morphology

The form / lemma ratio of DET is 5.754386 (the average of all parts of speech is 1.879397).
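The page does not spell out how this ratio is computed, but the figure is consistent with dividing the number of DET types by the number of DET lemmas reported above. A quick check, with variable names chosen only for illustration:

```python
# Counts reported on this page for DET.
det_types = 328
det_lemmas = 57

# Assumption: form / lemma ratio = distinct forms (types) / distinct lemmas.
ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 5.754386
```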

The 1st highest number of forms (21) was observed with the lemma “какой-то”: какая, какая-то, какаято, какие, какие-то, каким, какими, какими-то, каких-то, какого, какого-то, какое, какое-то, какой, какой-то, какойто, каком-то, какому-то, какую, какую-то, кого.

The 2nd highest number of forms (17) was observed with the lemma “весь”: Всея, вeсь, весь, всëм, все, всего, всей, всем, всеми, всему, всех, всею, всю, вся, всё, всём, свей.

The 3rd highest number of forms (14) was observed with the lemma “другой”: др, др., другая, другие, другим, другими, других, другого, другое, другой, другом, другому, другую, дургие.

DET occurs with 10 features: PronType (5696; 100% instances), Number (5335; 94% instances), Case (5327; 93% instances), Gender (3745; 66% instances), Poss (1517; 27% instances), Animacy (906; 16% instances), Reflex (412; 7% instances), Typo (58; 1% instances), Abbr (19; 0% instances), Variant (9; 0% instances)

DET occurs with 27 feature-value pairs: Abbr=Yes, Animacy=Anim, Animacy=Inan, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Number=Plur, Number=Sing, Poss=Yes, PronType=Dem, PronType=Emp, PronType=Exc, PronType=Ind, PronType=Int, PronType=Neg, PronType=Prs, PronType=Rel, PronType=Tot, Reflex=Yes, Typo=Yes, Variant=Short

DET occurs with 270 feature combinations. The most frequent feature combination is Poss=Yes|PronType=Prs (344 tokens). Examples: его, их, ее, её, eго
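The three feature statistics above (features, feature-value pairs, and full feature combinations) can all be tallied from the FEATS column. The following is a minimal sketch; the function name `feature_stats` and the sample strings are hypothetical, and multi-valued features such as `Case=Acc,Nom` are treated as a single pair for simplicity.

```python
from collections import Counter


def feature_stats(feats_column):
    """Count features, feature-value pairs and full combinations in FEATS strings."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1  # the whole FEATS string is one combination
        if feats == "_":
            continue  # "_" means the token has no features
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos


# Hypothetical FEATS values for four DET tokens.
sample_feats = [
    "Poss=Yes|PronType=Prs",
    "Poss=Yes|PronType=Prs",
    "Case=Nom|Number=Sing|PronType=Dem",
    "_",
]
features, pairs, combos = feature_stats(sample_feats)
print(features["PronType"], len(pairs), combos.most_common(1))
```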

Relations

DET nodes are attached to their parents using 24 different relations: det (4851; 85% instances), nsubj (191; 3% instances), obl (144; 3% instances), root (111; 2% instances), conj (80; 1% instances), acl (79; 1% instances), obj (58; 1% instances), fixed (45; 1% instances), nmod (35; 1% instances), iobj (21; 0% instances), parataxis (21; 0% instances), xcomp (17; 0% instances), ccomp (10; 0% instances), advcl (9; 0% instances), nsubj:pass (9; 0% instances), appos (4; 0% instances), advmod (3; 0% instances), orphan (3; 0% instances), acl:relcl (2; 0% instances), dep (1; 0% instances), flat (1; 0% instances), list (1; 0% instances), obl:agent (1; 0% instances), vocative (1; 0% instances)

Parents of DET nodes belong to 15 different parts of speech: NOUN (4424; 78% instances), VERB (436; 8% instances), ADJ (268; 5% instances), PRON (225; 4% instances), [root, no parent] (111; 2% instances), PROPN (77; 1% instances), DET (57; 1% instances), ADP (45; 1% instances), NUM (26; 0% instances), ADV (11; 0% instances), X (6; 0% instances), PART (5; 0% instances), AUX (4; 0% instances), CCONJ (2; 0% instances), SYM (1; 0% instances)
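A sketch of how the relation and parent part-of-speech counts could be gathered is shown below. Tokens are simplified to hypothetical `(id, upos, head, deprel)` tuples and the function name `parent_stats` is made up for illustration. A token whose head is 0 is attached to the artificial root and has no parent word, so its parent tag is recorded as an empty string here.

```python
from collections import Counter


def parent_stats(sentences, upos="DET"):
    """Count incoming relations and parent UPOS tags for tokens with the given tag."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences:
        pos_by_id = {tok_id: tok_upos for tok_id, tok_upos, _, _ in sent}
        for _tok_id, tok_upos, head, deprel in sent:
            if tok_upos != upos:
                continue
            relations[deprel] += 1
            # head == 0 has no entry in pos_by_id, so the parent tag is "".
            parent_pos[pos_by_id.get(head, "")] += 1
    return relations, parent_pos


# Two hypothetical sentences: a DET modifying a NOUN, and a DET as sentence root.
sample_sentences = [
    [(1, "DET", 2, "det"), (2, "NOUN", 0, "root")],
    [(1, "DET", 0, "root"), (2, "PUNCT", 1, "punct")],
]
relations, parent_pos = parent_stats(sample_sentences)
print(dict(relations), dict(parent_pos))
```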

4871 (85%) DET nodes are leaves.

518 (9%) DET nodes have one child.

155 (3%) DET nodes have two children.

154 (3%) DET nodes have three or more children.

The highest child degree of a DET node is 8.
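The child-degree figures above can be reproduced by counting, for each DET token, how many tokens in the same sentence name it as their head. The sketch below uses hypothetical `(id, upos, head, deprel)` tuples and a made-up function name, `child_degree_histogram`.

```python
from collections import Counter


def child_degree_histogram(sentences, upos="DET"):
    """Map number of children -> number of tokens with the given UPOS tag."""
    hist = Counter()
    for sent in sentences:
        # How many tokens point at each head ID within this sentence.
        n_children = Counter(head for _, _, head, _ in sent)
        for tok_id, tok_upos, _, _ in sent:
            if tok_upos == upos:
                hist[n_children[tok_id]] += 1  # missing IDs count as 0 children
    return hist


# Hypothetical data: one leaf DET and one DET with a single child.
sample_sentences = [
    [(1, "DET", 2, "det"), (2, "NOUN", 0, "root")],
    [(1, "DET", 0, "root"), (2, "PUNCT", 1, "punct")],
]
hist = child_degree_histogram(sample_sentences)
print(dict(hist), "highest child degree:", max(hist))
```

As a consistency check on the reported figures, the four buckets add up to the total number of DET tokens: 4871 + 518 + 155 + 154 = 5698.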

Children of DET nodes are attached using 27 different relations: advmod (351; 25% instances), punct (217; 15% instances), nsubj (128; 9% instances), nmod (114; 8% instances), case (99; 7% instances), acl:relcl (81; 6% instances), cc (81; 6% instances), conj (80; 6% instances), obl (51; 4% instances), goeswith (40; 3% instances), parataxis (33; 2% instances), cop (25; 2% instances), det (24; 2% instances), acl (18; 1% instances), mark (18; 1% instances), fixed (14; 1% instances), orphan (8; 1% instances), appos (7; 0% instances), advcl (6; 0% instances), discourse (5; 0% instances), amod (3; 0% instances), expl (3; 0% instances), vocative (3; 0% instances), dislocated (2; 0% instances), list (2; 0% instances), aux (1; 0% instances), flat (1; 0% instances)

Children of DET nodes belong to 16 different parts of speech: PART (271; 19% instances), NOUN (224; 16% instances), PUNCT (217; 15% instances), VERB (116; 8% instances), ADV (98; 7% instances), ADP (96; 7% instances), PRON (86; 6% instances), CCONJ (85; 6% instances), DET (57; 4% instances), ADJ (52; 4% instances), X (43; 3% instances), AUX (26; 2% instances), SCONJ (20; 1% instances), PROPN (18; 1% instances), SYM (4; 0% instances), NUM (2; 0% instances)