Treebank Statistics: UD_Russian-GSD: POS Tags: X
There are 1205 X lemmas (6%), 1215 X types (4%) and 1505 X tokens (2%).
Out of 16 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.
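The counts and ranks above could be reproduced from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: the function names `upos_counts` and `rank_of` and the four-token `SAMPLE` are hypothetical, and it assumes that types are distinct word forms compared case-sensitively (the type list on this page contains "Music" while the lemma list contains "music").

```python
from collections import defaultdict

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences, '#' lines are comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}

def rank_of(tag, counts, index):
    """1-based rank of `tag` by one statistic (0 lemmas, 1 types, 2 tokens)."""
    ordered = sorted(counts, key=lambda t: counts[t][index], reverse=True)
    return ordered.index(tag) + 1

# Hypothetical four-token sample, not taken from the treebank.
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "компания", "компания", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["2", "the", "the", "X", "_", "Foreign=Yes", "1", "appos", "_", "_"],
    ["3", "Music", "music", "X", "_", "Foreign=Yes", "2", "flat:foreign", "_", "_"],
    ["4", "music", "music", "X", "_", "Foreign=Yes", "2", "flat:foreign", "_", "_"],
])

counts = upos_counts(SAMPLE)
print(counts["X"], rank_of("X", counts, 2))
```

In the sample, X has two lemmas, three types and three tokens, so it ranks first by token count.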
The 10 most frequent X lemmas: the, of, _, a, and, Airlines, company, de, music, to
The 10 most frequent X types: the, of, a, and, Airlines, Music, Records, company, de, to
The 10 most frequent ambiguous lemmas: a (X 6, NOUN 2), де (PART 6, X 5), ISO (X 4, PROPN 2), windows (X 4, NOUN 1), F (X 3, NOUN 2), MTV (X 3, PROPN 1), i (NOUN 2, X 1), а (CCONJ 275, NOUN 10, X 3), и (CCONJ 2230, PART 119, X 3), стрит (X 2, NOUN 1)
The 10 most frequent ambiguous types: a (X 4, NOUN 2), де (PART 26, X 4), же (PART 115, X 5), C (X 4, ADP 2, NOUN 2), ISO (X 4, PROPN 2), Windows (X 4, NOUN 1), F (X 3, NOUN 2), I (ADJ 22, X 3), MTV (X 3, PROPN 1), а (CCONJ 261, X 3, NOUN 1)
- a
- де
- же
- C
- ISO
- X 4: Завод в Тале сертифицирован по стандарту DIN EN ISO 9001:2008 системы управления качеством продукции и по международным техническим условиям требований для компаний , занятых в производстве автомобильных комплектующих ISO / TS 16949:2002 .
- PROPN 2: Является подмножеством стандарта ISO 3166-2 , относящимся к Эквадору .
- Windows
- F
- I
- ADJ 22: Легион I Альпийский Юлиев ( ) – позднеримский легион .
- X 3: Касательно I Am … Sasha Fierce , отец , менеджер Бейонсе сказала : Мы думали нестандартно и придумали кое-что новое '' , а Бейонсе объяснила : Новая запись – это двойной альбом , у него две обложки , как у журнала есть две обложки '' .
- MTV
- а
- CCONJ 261: Мы не почувствовали , как вступили на землю Польши , а затем и Австрии .
- X 3: Абсолют делим ( бхеда ) и неделим ( а - бхеда ) в одно и то же время .
- NOUN 1: Эмблема Аки – стилизованное изображение знака японской силлабической азбуки ア ( а ) , представляющее название района в виде летящей птицы .
Morphology
The form / lemma ratio of X is 1.008299 (the average of all parts of speech is 1.598617).
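The reported ratio agrees with dividing the number of X types by the number of X lemmas given at the top of the page (1215 / 1205). The sketch below checks that arithmetic and shows one way such a ratio might be computed from (form, lemma) pairs. The function name `form_lemma_ratio` and the sample pairs are hypothetical, and the exact definition used to generate this page is an assumption.

```python
def form_lemma_ratio(pairs):
    """Distinct forms divided by distinct lemmas for (form, lemma) pairs."""
    pairs = list(pairs)
    forms = {form for form, _ in pairs}
    lemmas = {lemma for _, lemma in pairs}
    return len(forms) / len(lemmas)

# Consistency check against the counts reported on this page.
reported = round(1215 / 1205, 6)

# Hypothetical sample: the lemma "boy" has two forms, "the" has one.
sample_ratio = form_lemma_ratio([("Boy", "boy"), ("Boys", "boy"), ("the", "the")])
print(reported, sample_ratio)
```

A ratio this close to 1 means almost every X lemma occurs in a single form, as expected for foreign material that Russian does not inflect.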
The 1st highest number of forms (8) was observed with the lemma “_”: ru, ЗЗ, бы, же, западе, нибудь, соm, таки.
The 2nd highest number of forms (2) was observed with the lemma “Hume”: Hume, Hume's.
The 3rd highest number of forms (2) was observed with the lemma “boy”: Boy, Boys.
X occurs with 3 features: Foreign (1483; 99% instances), Abbr (2; 0% instances), Typo (1; 0% instances)
X occurs with 3 feature-value pairs: Abbr=Yes, Foreign=Yes, Typo=Yes
X occurs with 4 feature combinations.
The most frequent feature combination is Foreign=Yes (1482 tokens).
Examples: the, of, a, and, Airlines, Music, Records, company, de, to
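Feature and feature-combination counts like those above can be tallied from the FEATS column of a CoNLL-U file. The following is a minimal sketch under stated assumptions: `feature_stats` and the sample values are hypothetical, and a token with no features ("_") is counted here as its own combination, which may or may not match how this page counts.

```python
from collections import Counter

def feature_stats(feats_values):
    """Tally features and whole FEATS combinations for tokens of one tag."""
    features = Counter()
    combos = Counter()
    for feats in feats_values:
        combos[feats] += 1          # the full FEATS string is one combination
        if feats == "_":
            continue                # '_' means the token has no features
        for pair in feats.split("|"):
            name = pair.partition("=")[0]
            features[name] += 1
    return features, combos

# Hypothetical FEATS values, not taken from the treebank.
features, combos = feature_stats(
    ["Foreign=Yes", "Foreign=Yes", "Abbr=Yes|Foreign=Yes", "_"]
)
print(features.most_common(), combos.most_common(1))
```

A token carrying two features contributes to both feature tallies but to only one combination. That is consistent with Foreign being reported for 1483 tokens above while the combination Foreign=Yes alone covers 1482.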
Relations
X nodes are attached to their parents using 21 different relations: flat:foreign (635; 42% instances), appos (368; 24% instances), conj (138; 9% instances), nmod (99; 7% instances), nsubj (74; 5% instances), obl (40; 3% instances), flat:name (39; 3% instances), list (14; 1% instances), compound (12; 1% instances), goeswith (12; 1% instances), obj (11; 1% instances), orphan (11; 1% instances), parataxis (10; 1% instances), amod (9; 1% instances), nsubj:pass (8; 1% instances), xcomp (7; 0% instances), obl:agent (5; 0% instances), root (5; 0% instances), iobj (4; 0% instances), case (3; 0% instances), cc (1; 0% instances)
Parents of X nodes belong to 14 different parts of speech: X (846; 56% instances), NOUN (452; 30% instances), VERB (121; 8% instances), PROPN (48; 3% instances), ADJ (13; 1% instances), NUM (6; 0% instances), (5; 0% instances), CCONJ (4; 0% instances), SYM (4; 0% instances), ADP (2; 0% instances), ADV (1; 0% instances), DET (1; 0% instances), PART (1; 0% instances), SCONJ (1; 0% instances)
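The relation and parent part-of-speech tallies can be derived from the HEAD and DEPREL columns. The sketch below is an illustration only: `attachment_stats` and the sample rows are hypothetical. It records an empty tag for nodes whose head is 0 (the artificial root), which is one plausible reading of the unnamed entry with 5 instances in the parent list, matching the 5 root attachments.

```python
from collections import Counter

def attachment_stats(rows, tag="X"):
    """rows: (id, upos, head, deprel) tuples for one sentence, 1-based ids."""
    upos_by_id = {node_id: upos for node_id, upos, _, _ in rows}
    relations = Counter()
    parent_pos = Counter()
    for _, upos, head, deprel in rows:
        if upos != tag:
            continue
        relations[deprel] += 1
        # Head 0 is the artificial root, which has no part of speech.
        parent_pos[upos_by_id.get(head, "")] += 1
    return relations, parent_pos

# Hypothetical three-node sentence, not taken from the treebank.
relations, parent_pos = attachment_stats([
    (1, "X", 0, "root"),
    (2, "X", 1, "flat:foreign"),
    (3, "PUNCT", 1, "punct"),
])
print(relations, parent_pos)
```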
790 (52%) X nodes are leaves.
291 (19%) X nodes have one child.
155 (10%) X nodes have two children.
269 (18%) X nodes have three or more children.
The highest child degree of an X node is 17.
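The four child-degree buckets above sum to the 1505 X tokens reported at the top of the page. A minimal sketch of how such a histogram might be computed from the HEAD column follows. The function `child_degree_histogram` and the sample rows are hypothetical, and bucket 3 stands for "three or more children".

```python
from collections import Counter

def child_degree_histogram(rows, tag="X"):
    """rows: (id, upos, head) tuples for one sentence, 1-based ids."""
    n_children = Counter(head for _, _, head in rows)
    hist = Counter()
    for node_id, upos, _ in rows:
        if upos == tag:
            # Counter returns 0 for nodes that never appear as a head.
            hist[min(n_children[node_id], 3)] += 1
    return hist

# Consistency check on the figures reported on this page.
total = 790 + 291 + 155 + 269

# Hypothetical sentence: node 2 has three children, nodes 3 and 4 are leaves.
hist = child_degree_histogram([
    (1, "NOUN", 0), (2, "X", 1), (3, "X", 2), (4, "X", 2), (5, "PUNCT", 2),
])
print(total, dict(hist))
```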
Children of X nodes are attached using 25 different relations: punct (680; 36% instances), flat:foreign (647; 34% instances), conj (145; 8% instances), appos (82; 4% instances), case (81; 4% instances), cc (51; 3% instances), flat:name (38; 2% instances), nummod:entity (36; 2% instances), nmod (23; 1% instances), amod (21; 1% instances), list (16; 1% instances), advmod (10; 1% instances), nummod (7; 0% instances), acl (6; 0% instances), dep (6; 0% instances), parataxis (6; 0% instances), acl:relcl (5; 0% instances), nummod:gov (4; 0% instances), orphan (3; 0% instances), advcl (2; 0% instances), det (2; 0% instances), nsubj (2; 0% instances), obl (2; 0% instances), compound (1; 0% instances), goeswith (1; 0% instances)
Children of X nodes belong to 13 different parts of speech: X (846; 45% instances), PUNCT (680; 36% instances), ADP (73; 4% instances), NOUN (67; 4% instances), NUM (55; 3% instances), CCONJ (48; 3% instances), PROPN (34; 2% instances), ADJ (27; 1% instances), VERB (18; 1% instances), SYM (13; 1% instances), ADV (9; 0% instances), DET (4; 0% instances), PART (3; 0% instances)