Treebank Statistics: UD_Livvi-KKPP: POS Tags: X
There are 8 X lemmas (1%), 10 X types (1%) and 10 X tokens (1%).
Out of 14 observed tags, the rank of X is: 10 in number of lemmas, 11 in number of types and 13 in number of tokens.
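The lemma, type and token counts and the per-tag ranks above could be derived from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: the rows are a toy sample reusing a few forms listed here, `tag_stats` and `rank` are hypothetical helper names, forms are compared case-sensitively, and ties are ranked by counting only strictly larger tags (both are assumptions).

```python
from collections import defaultdict

# Toy rows (FORM, LEMMA, UPOS) reusing a few forms from this page; the
# selection and counts are illustrative, not the real treebank contents.
ROWS = [
    ("piduhuttu", "piduhus", "NOUN"),
    ("puččii", "pučči", "NOUN"),
    ("piduhuttu", "piduhus", "NOUN"),
    ("Karelija", "Karelija", "X"),
    ("–", "–", "PUNCT"),
]
# Build a minimal 10-column CoNLL-U fragment; HEAD and DEPREL are placeholders.
SAMPLE = "\n".join(
    "\t".join([str(i), form, lemma, upos, "_", "_", "0", "dep", "_", "_"])
    for i, (form, lemma, upos) in enumerate(ROWS, start=1)
)

def tag_stats(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}

def rank(stats, tag, index):
    """1-based rank of `tag` by stats[tag][index] (0=lemmas, 1=types, 2=tokens).

    Assumption: a tag's rank is one plus the number of tags with a strictly
    larger count, so tied tags share a rank.
    """
    return 1 + sum(1 for t in stats if stats[t][index] > stats[tag][index])

stats = tag_stats(SAMPLE)
print(stats["X"], rank(stats, "X", 2))
```

On the toy sample, X has one lemma, one type and one token, and ranks second by token count behind NOUN.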
The 10 most frequent X lemmas: _, Karelija, d’engaa, eto, mi, piduhus, pučči, –
The 10 most frequent X types: Karelija, d’engaa, eto, mi, piduhuttu, puččii, ttiteatr, u, y, –
The 10 most frequent ambiguous lemmas: Karelija (PROPN 1, X 1), mi (PRON 5, X 1), piduhus (NOUN 5, X 1), pučči (NOUN 2, X 1), – (PUNCT 16, X 1)
The 10 most frequent ambiguous types: Karelija (PROPN 1, X 1), d’engaa (NOUN 1, X 1), piduhuttu (NOUN 5, X 1), puččii (NOUN 2, X 1), – (PUNCT 16, X 1)
- Karelija
  - PROPN 1: Festivualin hantuzis pietäh kaksi kilbua nuorih niškoi : Moja Karelija -videos’užietoin kilbu da Karelija – eto mi -karjalan , vepsän da suomen kieldy maltajien kilbu .
  - X 1: Festivualin hantuzis pietäh kaksi kilbua nuorih niškoi : Moja Karelija -videos’užietoin kilbu da Karelija – eto mi -karjalan , vepsän da suomen kieldy maltajien kilbu .
- d’engaa
- piduhuttu
  - NOUN 5: Sit oli häkki : seičče virstaa oli sarvet piduhuttu ; paimoi istui sarvel , toine toizel ; torvittih , ga toine toizen ei kuulluh torvindaa .
  - X 1: ” Kuunelkaa nygöi , sanon saaraa , – häi sanoo , – konzu minun taatto oli bohattu , kolme virstaa oli kodi piduhuttu , a virstaa levevytty ; kezäkse kraassi koin mustal kraaskal , a talvekse valgiel .
- puččii
- –
Morphology
The form / lemma ratio of X is 1.250000 (the average of all parts of speech is 1.335034).
The 1st highest number of forms (3) was observed with the lemma “_”: ttiteatr, u, y.
The 2nd highest number of forms (1) was observed with the lemma “Karelija”: Karelija.
The 3rd highest number of forms (1) was observed with the lemma “d’engaa”: d’engaa.
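The ratio of 1.250000 is consistent with 10 distinct forms divided by 8 lemmas. The sketch below reproduces that figure and the per-lemma form counts. The form-to-lemma pairing is inferred from the lemma and type lists on this page (for example piduhuttu with piduhus), not read from the treebank, and `form_lemma_stats` is a hypothetical helper name.

```python
from collections import defaultdict

# (form, lemma) pairs for tag X. The pairing is inferred from the lists on
# this page (an assumption), not extracted from the treebank itself.
X_PAIRS = [
    ("ttiteatr", "_"), ("u", "_"), ("y", "_"),
    ("Karelija", "Karelija"), ("d’engaa", "d’engaa"), ("eto", "eto"),
    ("mi", "mi"), ("piduhuttu", "piduhus"), ("puččii", "pučči"), ("–", "–"),
]

def form_lemma_stats(pairs):
    """Return (forms-per-lemma ratio, {lemma: number of distinct forms})."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    per_lemma = {lemma: len(forms) for lemma, forms in forms_by_lemma.items()}
    ratio = sum(per_lemma.values()) / len(per_lemma)
    return ratio, per_lemma

ratio, per_lemma = form_lemma_stats(X_PAIRS)
print(f"{ratio:.6f}")   # 1.250000
print(per_lemma["_"])   # 3
```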
X occurs with 3 features: Foreign (4; 40% instances), Case (3; 30% instances), Number (3; 30% instances)
X occurs with 4 feature-value pairs: Case=Par, Foreign=Yes, Number=Plur, Number=Sing
X occurs with 4 feature combinations.
The most frequent feature combination is Foreign=Yes (4 tokens).
Examples: Karelija, eto, mi, –
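Feature, feature-value and combination counts like those above can be tallied from the CoNLL-U FEATS column. The sketch below uses a hypothetical list of FEATS values chosen to match the reported totals (Foreign 4, Case 3, Number 3, 10 tokens). The split between Number=Sing and Number=Plur and the use of the empty value `_` as the fourth combination are not given on this page and are assumptions made for illustration.

```python
from collections import Counter

def parse_feats(feats):
    """Parse a CoNLL-U FEATS string such as 'Case=Par|Number=Sing' into a dict."""
    if feats == "_":
        return {}  # '_' marks a token without morphological features
    return dict(pair.split("=", 1) for pair in feats.split("|"))

# Hypothetical FEATS values for the 10 X tokens. Totals per feature match the
# page; the Sing/Plur split and the three empty values are made up.
FEATS = (
    ["Foreign=Yes"] * 4
    + ["Case=Par|Number=Sing"] * 2
    + ["Case=Par|Number=Plur"]
    + ["_"] * 3
)

feature_counts = Counter()  # e.g. Case -> 3
pair_counts = Counter()     # e.g. Case=Par -> 3
for feats in FEATS:
    for name, value in parse_feats(feats).items():
        feature_counts[name] += 1
        pair_counts[f"{name}={value}"] += 1
combo_counts = Counter(FEATS)  # whole FEATS strings, '_' included

print(feature_counts)
print(combo_counts.most_common(1))
```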
Relations
X nodes are attached to their parents using 6 different relations: flat:foreign (3; 30% instances), goeswith (3; 30% instances), compound:nn (1; 10% instances), nmod (1; 10% instances), obj (1; 10% instances), parataxis (1; 10% instances)
Parents of X nodes belong to 4 different parts of speech: NOUN (4; 40% instances), X (4; 40% instances), ADJ (1; 10% instances), VERB (1; 10% instances)
7 (70%) X nodes are leaves.
0 (0%) X nodes have one child.
1 (10%) X nodes have two children.
2 (20%) X nodes have three or more children.
The highest child degree of an X node is 6.
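The leaf and child-degree figures above (7 + 0 + 1 + 2 = 10 X nodes) come from counting, for each node, how many other nodes name it as HEAD. A minimal sketch for a single sentence follows. The toy tree is hypothetical and not taken from the treebank, and `child_degree_stats` is an illustrative helper name.

```python
from collections import Counter

# Toy sentence as (ID, UPOS, HEAD) triples; hypothetical, not from the treebank.
# Node 1 is the root (HEAD 0) and has four children.
NODES = [
    (1, "X", 0),
    (2, "X", 1),
    (3, "X", 1),
    (4, "NOUN", 1),
    (5, "PUNCT", 1),
]

def child_degree_stats(nodes, tag):
    """Bucket nodes tagged `tag` by number of children; also return the maximum.

    Assumes `nodes` holds one sentence, so IDs are unique.
    """
    n_children = Counter(head for _, _, head in nodes)
    buckets = Counter()
    max_degree = 0
    for node_id, upos, _ in nodes:
        if upos != tag:
            continue
        degree = n_children[node_id]  # Counter returns 0 for nodes without children
        max_degree = max(max_degree, degree)
        if degree == 0:
            buckets["leaf"] += 1
        elif degree == 1:
            buckets["one"] += 1
        elif degree == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets, max_degree

buckets, max_degree = child_degree_stats(NODES, "X")
print(dict(buckets), max_degree)
```

For a whole treebank, the same counting would be run per sentence and the buckets summed.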
Children of X nodes are attached using 10 different relations: flat:foreign (3; 25% instances), cc (1; 8% instances), conj (1; 8% instances), cop (1; 8% instances), nmod (1; 8% instances), nsubj:cop (1; 8% instances), nummod (1; 8% instances), obj (1; 8% instances), parataxis (1; 8% instances), punct (1; 8% instances)
Children of X nodes belong to 7 different parts of speech: X (4; 33% instances), NOUN (3; 25% instances), AUX (1; 8% instances), CCONJ (1; 8% instances), NUM (1; 8% instances), PUNCT (1; 8% instances), VERB (1; 8% instances)
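The percentages in the relation lists above are consistent with each count divided by the total of its own list and rounded to a whole percent: 10 attachments for X nodes to their parents, 12 for children of X nodes. This is an inference from the numbers, not a documented formula. The sketch below checks it against the counts reported here; `percentages` is a hypothetical helper name.

```python
# Counts copied from the relation lists on this page.
PARENT_RELATIONS = {
    "flat:foreign": 3, "goeswith": 3, "compound:nn": 1,
    "nmod": 1, "obj": 1, "parataxis": 1,
}
CHILD_RELATIONS = {
    "flat:foreign": 3, "cc": 1, "conj": 1, "cop": 1, "nmod": 1,
    "nsubj:cop": 1, "nummod": 1, "obj": 1, "parataxis": 1, "punct": 1,
}

def percentages(counts):
    """Share of each key in percent, rounded to the nearest integer."""
    total = sum(counts.values())
    return {key: round(100 * value / total) for key, value in counts.items()}

print(percentages(PARENT_RELATIONS)["goeswith"])     # 30
print(percentages(CHILD_RELATIONS)["flat:foreign"])  # 25
print(percentages(CHILD_RELATIONS)["cc"])            # 8
```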