Treebank Statistics: UD_Estonian-EDT: POS Tags: X
There are 226 X lemmas (1%), 302 X types (0%) and 765 X tokens (0%).
Out of 17 observed tags, the rank of X is: 7 in number of lemmas, 9 in number of types and 14 in number of tokens.
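Counts of this kind can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the generator that produced this page: the three-token sample is invented for illustration, the function names `tag_counts` and `rank` are hypothetical, and it assumes that types are case-sensitive word forms.

```python
from collections import defaultdict

# Invented three-token sample in CoNLL-U format (not taken from the treebank).
SAMPLE = "\n".join([
    "1\tet\tet\tX\t_\tForeign=Yes\t0\troot\t_\t_",
    "2\tal.\tal\tX\t_\tAbbr=Yes|Foreign=Yes\t1\tflat:foreign\t_\t_",
    "3\tet\tet\tSCONJ\t_\t_\t1\tdep\t_\t_",
])

def tag_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(types[t]), tokens[t]) for t in tokens}

def rank(counts, tag, index):
    """1-based rank of `tag` by one column (0 = lemmas, 1 = types, 2 = tokens)."""
    ordered = sorted(counts, key=lambda t: counts[t][index], reverse=True)
    return ordered.index(tag) + 1

counts = tag_counts(SAMPLE)
print(counts["X"], rank(counts, "X", 2))
```

Run on the full treebank instead of the sample, the same tallies would give the lemma, type and token figures and the three ranks reported above, provided the assumptions match those of the original tool.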
The 10 most frequent X lemmas: _, al, et, of, in, ceteris, de, paribus, F, a
The 10 most frequent X types: 000, al., et, of, in, 900, 500, 600, 700, ceteris
The 10 most frequent ambiguous lemmas: et (SCONJ 4256, X 95), of (ADP 24, X 12, ADV 1), in (X 7, ADP 6), de (PROPN 26, X 5, ADP 1), a (NOUN 231, X 2, ADV 1), b (NOUN 16, X 1), no (INTJ 43, X 3), out (X 3, ADP 1), C (NOUN 5, X 2, PROPN 1), XX (ADJ 4, X 2)
The 10 most frequent ambiguous types: et (SCONJ 4086, X 94), of (ADP 22, X 12, ADV 1), in (X 7, ADP 6), 900 (NUM 8, X 7), 500 (NUM 22, X 5, ADJ 1), 600 (NUM 15, X 5), 700 (NUM 8, X 5), de (PROPN 26, X 5, ADP 1), a (NOUN 104, X 2, ADV 1), 400 (NUM 19, X 3)
- et
- of
- ADP 22: * “ State of the Worldi ” antakse välja rohkem kui 30 keeles
- X 12: Muusikalise poole eest hoolitsevad Bedwetters , House of Games , Hannaliisa Uusma ning surematu duo Toost ja Zak .
- ADV 1: See siin on Simply Redi armastuslugude best of , kuid enne kui midagi räägime , teeme selgeks , kus on point .
- in
- 900
- 500
- NUM 22: Räägitakse , et Tartu linna juhid lubavad vanglasse ainult 500 vangi .
- X 5: Seega on Tallinnas tervelt 31 500 miljonilist eramut .
- ADJ 1: Kindel on aga see , et maailma ühe juhtiva telekommunikatsioonifirma AT & T ligi 500 000 000 000-objektise andmetabeli [ 8 ] analüüsimisega jääksid traditsioonilised vahendid hätta
- 600
- 700
- de
- PROPN 26: Kindral de Gaulle ütles kaks korda ei .
- X 5: Ei ole ime , et tolleaegsetel Hispaania veinidel oli eriline maitse - olor de bota - ja sogane välimus .
- ADP 1: ” No , de jure oleks ta muidugi Eesti Sotsiaaldemokratliku Töölispartei mees , kuhu täna kuulub ka minu ema - sest nad ei ole harjunud parteisid vahetama .
- a
- NOUN 104: 1993. a sõitsin kongressile Udmurdimaale .
- X 2: a ) mulle ei meeldi teha seda , mida kõik teevad ;
- ADV 1: Kui inimene tahab rikkaks saada ( a miks muidu see Vargamäe Andres ja minu isa ja ema niipalju tööd tegid - ikka raha pärast , mis siis , et üks või teine mokaotsast ka jumalast või armastusest rääkis ) , siis ta teeb palju tööd ja loob väärtusi ja teistel on temast hea meel .
- 400
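An ambiguous type, as used in the list above, is a word form observed with more than one tag. A minimal sketch of how such a table could be built follows; the function name `ambiguous_types` and the four sample pairs are made up for illustration and do not reproduce the treebank counts.

```python
from collections import Counter, defaultdict

def ambiguous_types(pairs):
    """Map each form seen with two or more tags to its (tag, count) list."""
    by_form = defaultdict(Counter)
    for form, upos in pairs:
        by_form[form][upos] += 1
    # Keep only forms with more than one distinct tag, most frequent tag first.
    return {form: tags.most_common()
            for form, tags in by_form.items() if len(tags) > 1}

# Invented (form, tag) observations.
pairs = [("of", "ADP"), ("of", "ADP"), ("of", "X"), ("et", "SCONJ")]
print(ambiguous_types(pairs))
```

Replacing the form with the lemma in each pair would give the corresponding table of ambiguous lemmas.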
Morphology
The form / lemma ratio of X is 1.336283 (the average of all parts of speech is 1.912964).
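The reported ratio is consistent with dividing the number of X types by the number of X lemmas given at the top of the page. A short check, assuming that this is how the ratio is defined:

```python
# Figures taken from this page: 302 X types and 226 X lemmas.
x_types = 302
x_lemmas = 226

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # 1.336283
```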
The 1st highest number of forms (76) was observed with the lemma “_”: ‘is, -1,5, -2, -3-rasvhapete, -45,8, -5, -6,5, -9,9, -aastased, -aastaselt, -kilobaidine, /1995, 000, 000-100, 000-l, 000-naelasele, 000-objektise, 000kroonine, 000ni, 000st, 02, 04, 083*1012, 090, 100, 150, 17, 17.00, 2, 20, 203, 257, 357, 371, 400, 402, 44, 479, 496, 500, 50aastased, 522, 547, 60, 600, 690, 692, 700, 756, 780, 782, 800, 83, 890, 892, 90, 900, 914, 930, 950, 950-kroonise, 951, 981, 996, Angeles-klassi, Angeles-klassile, aastased, arhitektuurides, e, keelsessegi, kroonine, mehelises, o, rian, trulli, °C.
The 2nd highest number of forms (2) was observed with the lemma “al”: al, al..
The 3rd highest number of forms (2) was observed with the lemma “et”: et, et..
X occurs with 2 features: Foreign (424; 55% instances), Abbr (30; 4% instances)
X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes
X occurs with 3 feature combinations.
The most frequent feature combination is Foreign=Yes (424 tokens).
Examples: al., et, ceteris, de, paribus, in, tõ, Helicobacter, Marsa, khorji
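The three kinds of feature counts above (features, feature-value pairs, feature combinations) can be derived from the FEATS column of a CoNLL-U file. The sketch below is illustrative only: `feature_stats` is a hypothetical helper, the sample values are invented, and it assumes that an empty FEATS value (`_`) counts as a combination of its own, which may differ from the original tool.

```python
from collections import Counter

def feature_stats(feats_column):
    """Count features, feature-value pairs and whole combinations."""
    features, pairs, combos = Counter(), Counter(), Counter()
    for feats in feats_column:
        combos[feats] += 1          # the full FEATS string is one combination
        if feats == "_":
            continue                # no features on this token
        for pair in feats.split("|"):
            name, _, _value = pair.partition("=")
            features[name] += 1
            pairs[pair] += 1
    return features, pairs, combos

# Invented FEATS values for four X tokens.
sample = ["Foreign=Yes", "Abbr=Yes|Foreign=Yes", "Foreign=Yes", "_"]
features, pairs, combos = feature_stats(sample)
print(features, len(pairs), len(combos))

# Percentages on this page are shares of the 765 X tokens.
print(round(100 * 424 / 765), round(100 * 30 / 765))
```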
Relations
X nodes are attached to their parents using 18 different relations: goeswith (271; 35% instances), flat:foreign (211; 28% instances), flat (134; 18% instances), parataxis (39; 5% instances), root (23; 3% instances), nmod (21; 3% instances), appos (18; 2% instances), conj (16; 2% instances), ccomp (8; 1% instances), dep (4; 1% instances), nsubj (4; 1% instances), nsubj:cop (4; 1% instances), obj (4; 1% instances), obl (3; 0% instances), amod (2; 0% instances), discourse (1; 0% instances), fixed (1; 0% instances), orphan (1; 0% instances)
Parents of X nodes belong to 10 different parts of speech: NUM (254; 33% instances), X (239; 31% instances), PROPN (116; 15% instances), NOUN (76; 10% instances), VERB (32; 4% instances), no parent, i.e. root nodes (23; 3% instances), ADJ (22; 3% instances), ADV (1; 0% instances), PRON (1; 0% instances), SYM (1; 0% instances)
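A parent distribution like this one can be tallied from the UPOS and HEAD columns. In the sketch below, a token whose head is 0 has no parent tag, so it is reported under a placeholder label; the helper names and the four-token sentence are hypothetical.

```python
from collections import Counter

def parent_pos_counts(tokens, tag="X"):
    """tokens: (upos, head) pairs of one sentence; heads are 1-based, 0 = root."""
    counts = Counter()
    for upos, head in tokens:
        if upos != tag:
            continue
        # A root token has no parent; use a placeholder instead of a tag.
        parent = tokens[head - 1][0] if head > 0 else "(root)"
        counts[parent] += 1
    return counts

def percent(count, total):
    """Share of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)

# Invented sentence: token 1 is the root, tokens 2 and 3 attach to it,
# token 4 attaches to token 3.
sentence = [("X", 0), ("X", 1), ("NUM", 1), ("X", 3)]
print(parent_pos_counts(sentence))
print(percent(254, 765), percent(23, 765))
```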
535 (70%) X nodes are leaves.
129 (17%) X nodes have one child.
28 (4%) X nodes have two children.
73 (10%) X nodes have three or more children.
The highest child degree of an X node is 9.
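The four child-degree counts above sum to the 765 X tokens (535 + 129 + 28 + 73). A minimal sketch of how such buckets could be computed from HEAD indices follows; `degree_buckets` and the five-token sentence are made up for illustration.

```python
from collections import Counter

def degree_buckets(heads, is_target):
    """Bucket target tokens by number of children: 0, 1, 2, or 3 (= three or more).

    heads[i] is the 1-based head of token i+1 (0 = root);
    is_target[i] is True when token i+1 carries the tag of interest.
    """
    children = Counter(h for h in heads if h != 0)  # children per head index
    buckets = Counter()
    for index, target in enumerate(is_target, start=1):
        if target:
            buckets[min(children[index], 3)] += 1
    return buckets

# Invented sentence: tokens 2, 3 and 4 attach to token 1, token 5 to token 2.
heads = [0, 1, 1, 1, 2]
is_target = [True, True, True, False, False]
print(degree_buckets(heads, is_target))

assert 535 + 129 + 28 + 73 == 765  # the page's buckets cover every X token
```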
Children of X nodes are attached using 24 different relations: flat:foreign (211; 43% instances), punct (179; 37% instances), flat (22; 5% instances), conj (20; 4% instances), advmod (7; 1% instances), parataxis (7; 1% instances), cc (5; 1% instances), cop (5; 1% instances), nummod (5; 1% instances), appos (4; 1% instances), nmod (3; 1% instances), nsubj:cop (3; 1% instances), obl (3; 1% instances), advcl (2; 0% instances), acl (1; 0% instances), acl:relcl (1; 0% instances), amod (1; 0% instances), case (1; 0% instances), cc:preconj (1; 0% instances), csubj:cop (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), obj (1; 0% instances), orphan (1; 0% instances)
Children of X nodes belong to 12 different parts of speech: X (239; 49% instances), PUNCT (179; 37% instances), NOUN (25; 5% instances), NUM (11; 2% instances), ADV (9; 2% instances), AUX (5; 1% instances), CCONJ (5; 1% instances), VERB (5; 1% instances), ADJ (3; 1% instances), PROPN (2; 0% instances), SYM (2; 0% instances), ADP (1; 0% instances)