
This page pertains to UD version 2.

Treebank Statistics: UD_Kyrgyz-KTMU: POS Tags: PROPN

There are 737 PROPN lemmas (18%), 934 PROPN types (13%) and 2843 PROPN tokens (12%). Out of 13 observed tags, the rank of PROPN is: 3 in number of lemmas, 3 in number of types and 4 in number of tokens.

The 10 most frequent PROPN lemmas: кыргызстан, Бишкек, Ош, Ысык-Көл, Жалал-Абад, Баткен, Чүй, Россия, кыргыз, Нарын

The 10 most frequent PROPN types: Кыргызстанда, Бишкекте, кыргызстан, Ош, Кыргызстандын, Бишкек, Бишкектин, Жалал-Абад, Кыргызстанга, Ысык-Көлдө

The 10 most frequent ambiguous lemmas: Бишкек (PROPN 277, NOUN 1), Ысык-Көл (PROPN 78, NOUN 4), Россия (PROPN 36, NOUN 1), кыргыз (PROPN 15, CCONJ 1, NOUN 1), Кытай (PROPN 27, NOUN 2, ADJ 1), Өкм (PROPN 16, NOUN 1), япония (PROPN 4, NOUN 1), ЕАЭБ (PROPN 12, NOUN 1), Сузак (PROPN 12, NOUN 3), кыргызстандык (NOUN 17, PROPN 10)

The 10 most frequent ambiguous types: Кыргызстандын (PROPN 82, NOUN 2), Бишкектеги (PROPN 24, NOUN 1), Кыргызстандан (PROPN 17, NOUN 1), ГЭС (PROPN 9, NOUN 1), ИДПнын (PROPN 8, NOUN 1), түрк (NOUN 3, NUM 1, PROPN 1), Кыргызстандык (PROPN 6, NOUN 3), ЕАЭБ (PROPN 5, NOUN 1), Кыргызстандыктар (PROPN 5, NOUN 2), Россиядагы (PROPN 5, NOUN 1)

Morphology

The form / lemma ratio of PROPN is 1.267300 (the average of all parts of speech is 1.653599).
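The reported ratio is consistent with dividing the number of PROPN types by the number of PROPN lemmas given at the top of this page (934 and 737). A minimal sketch of that arithmetic follows; the function name `form_lemma_ratio` is hypothetical, and reading the ratio as types divided by lemmas is an assumption that happens to match the published figure, not a description of the script that generated the page.

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported on this page for PROPN: 934 types, 737 lemmas.
propn_ratio = form_lemma_ratio(934, 737)
print(f"{propn_ratio:.6f}")
```

A ratio close to 1 means most lemmas appear in only one form; PROPN falls below the treebank-wide average of 1.653599.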

The 1st highest number of forms (7) was observed with the lemma “Бишкек”: Бишкек, Бишкекке, Бишкекте, Бишкектеги, Бишкектен, Бишкекти, Бишкектин.

The 2nd highest number of forms (7) was observed with the lemma “Казакстан”: Казакстан, Казакстанга, Казакстанда, Казакстандагы, Казакстандан, Казакстандык, Казакстандын.

The 3rd highest number of forms (7) was observed with the lemma “Кыргызстан”: Кыргызстан, Кыргызстанга, Кыргызстанда, Кыргызстандагы, Кыргызстандан, Кыргызстандык, Кыргызстандын.

PROPN occurs with 7 features: Case (2820; 99% instances), Number (2820; 99% instances), Person (2621; 92% instances), Person[psor] (293; 10% instances), Number[psor] (168; 6% instances), Abbr (163; 6% instances), PronType (18; 1% instances)

PROPN occurs with 20 feature-value pairs: Abbr=Yes, Case=Abl, Case=Abl,Gen, Case=Acc, Case=Acc,Gen, Case=Acc,Loc, Case=Dat, Case=Equ, Case=Gen, Case=Loc, Case=Nom, Number=Plur, Number=Sing, Number[psor]=Sing, Person=2, Person=3, Person[psor]=2, Person[psor]=3, PronType=Ind, PronType=Prs

PROPN occurs with 50 feature combinations. The most frequent feature combination is Case=Nom|Number=Sing|Person=3 (1259 tokens). Examples: кыргызстан, Ош, Бишкек, Жалал-Абад, кыргыз, Токтогул, Ысык-Көл, Чүй, Баткен, Нарын

Relations

PROPN nodes are attached to their parents using 11 different relations: nmod (1274; 45% instances), obl (832; 29% instances), nsubj (287; 10% instances), flat (180; 6% instances), conj (142; 5% instances), nmod:poss (65; 2% instances), compound (37; 1% instances), root (11; 0% instances), amod (10; 0% instances), obj (4; 0% instances), csubj (1; 0% instances)
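Relation counts of this kind can in principle be recomputed from the treebank's CoNLL-U files. The sketch below is not the script that produced this page: `propn_relations` is a hypothetical helper, and `SAMPLE` is an invented three-token toy fragment, not a sentence from UD_Kyrgyz-KTMU. It assumes the standard ten tab-separated CoNLL-U columns, with UPOS in column 4 and DEPREL in column 8.

```python
from collections import Counter


def propn_relations(conllu_text: str) -> Counter:
    """Count the DEPREL of every token whose UPOS is PROPN."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == "PROPN":
            counts[cols[7]] += 1
    return counts


# Invented toy fragment (illustration only, not treebank data).
SAMPLE = "\n".join("\t".join(row) for row in [
    ["1", "Ош", "Ош", "PROPN", "_", "_", "3", "nsubj", "_", "_"],
    ["2", "Бишкекте", "Бишкек", "PROPN", "_", "_", "3", "obl", "_", "_"],
    ["3", "жашайт", "_", "VERB", "_", "_", "0", "root", "_", "_"],
])

relation_counts = propn_relations(SAMPLE)
for relation, count in relation_counts.most_common():
    print(relation, count)
```

Run over the full treebank files, the same loop would yield a tally comparable to the list above, provided the files match the release the page was built from.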

Parents of PROPN nodes belong to 11 different parts of speech: VERB (1145; 40% instances), NOUN (1141; 40% instances), PROPN (462; 16% instances), ADJ (58; 2% instances), NUM (13; 0% instances), no parent [root] (11; 0% instances), ADV (6; 0% instances), PRON (3; 0% instances), CCONJ (2; 0% instances), ADP (1; 0% instances), PUNCT (1; 0% instances)

1948 (69%) PROPN nodes are leaves.

694 (24%) PROPN nodes have one child.

172 (6%) PROPN nodes have two children.

29 (1%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 4.
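The child degree of a node is the number of tokens in the same sentence whose HEAD points to it. As a hedged illustration of how the leaf, one-child and two-children figures above could be tallied, the sketch below counts children per PROPN node in CoNLL-U text. `propn_child_degrees` is a hypothetical helper, and `SAMPLE` is an invented two-sentence toy fragment, not data from UD_Kyrgyz-KTMU; sentences are assumed to be separated by a blank line.

```python
from collections import Counter


def propn_child_degrees(conllu_text: str) -> Counter:
    """Map child degree -> number of PROPN nodes with that many children."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in block.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep ordinary tokens only (integer ID, ten columns).
        rows = [r for r in rows if len(r) == 10 and r[0].isdigit()]
        # HEAD (column 7) names the parent, so counting HEAD values
        # gives the number of children of each token ID.
        children_per_id = Counter(r[6] for r in rows)
        for r in rows:
            if r[3] == "PROPN":
                degrees[children_per_id[r[0]]] += 1
    return degrees


def _sentence(rows):
    return "\n".join("\t".join(row) for row in rows)


# Invented toy fragment (illustration only, not treebank data).
SAMPLE = "\n\n".join([
    _sentence([
        ["1", "Ош", "Ош", "PROPN", "_", "_", "0", "root", "_", "_"],
        ["2", "жана", "жана", "CCONJ", "_", "_", "3", "cc", "_", "_"],
        ["3", "Бишкек", "Бишкек", "PROPN", "_", "_", "1", "conj", "_", "_"],
    ]),
    _sentence([
        ["1", "Нарын", "Нарын", "PROPN", "_", "_", "0", "root", "_", "_"],
    ]),
])

degree_counts = propn_child_degrees(SAMPLE)
print(dict(degree_counts))
```

In the toy fragment, Ош and Бишкек each have one child and Нарын is a leaf; a degree of 0 corresponds to the "leaves" line above.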

Children of PROPN nodes are attached using 19 different relations: punct (287; 25% instances), nmod (251; 22% instances), flat (178; 16% instances), conj (163; 14% instances), cc (123; 11% instances), compound (30; 3% instances), advmod (25; 2% instances), amod (16; 1% instances), nmod:poss (14; 1% instances), acl (10; 1% instances), case (10; 1% instances), obl (8; 1% instances), nummod (7; 1% instances), nsubj (4; 0% instances), advmod:emph (2; 0% instances), det (2; 0% instances), mark (2; 0% instances), ccomp (1; 0% instances), compound:svc (1; 0% instances)

Children of PROPN nodes belong to 10 different parts of speech: PROPN (462; 41% instances), PUNCT (287; 25% instances), NOUN (183; 16% instances), CCONJ (136; 12% instances), ADV (28; 2% instances), VERB (14; 1% instances), ADJ (11; 1% instances), NUM (10; 1% instances), DET (2; 0% instances), PRON (1; 0% instances)