Treebank Statistics: UD_Estonian-EWT: POS Tags: PUNCT
There are 64 PUNCT lemmas (1%), 67 PUNCT types (0%) and 14846 PUNCT tokens (16%).
Out of 17 observed tags, the rank of PUNCT is: 10 in number of lemmas, 14 in number of types and 2 in number of tokens.
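The lemma, type and token counts above can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the toy sentence are hypothetical, and it assumes that "types" means distinct word forms compared case-sensitively.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for words tagged `upos`."""
    lemmas, forms, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:       # column 4 is UPOS
            forms.add(cols[1])    # column 2 is FORM
            lemmas.add(cols[2])   # column 3 is LEMMA
            tokens += 1
    return len(lemmas), len(forms), tokens


# Toy input (hypothetical, not taken from the treebank).
SAMPLE = "\n".join([
    "# text = Tere , maailm !",
    "1\tTere\ttere\tINTJ\t_\t_\t0\troot\t_\t_",
    "2\t,\t,\tPUNCT\t_\t_\t3\tpunct\t_\t_",
    "3\tmaailm\tmaailm\tNOUN\t_\t_\t1\tvocative\t_\t_",
    "4\t!\t!\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
])

print(upos_counts(SAMPLE, "PUNCT"))
```

Running the same function over the full treebank text with `"PUNCT"` should give figures comparable to those reported here, provided the assumptions above match how the page was generated.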
The 10 most frequent PUNCT lemmas: ,, ., :, ?, “, …, !, ), -, (
The 10 most frequent PUNCT types: ,, ., :, ?, “, …, !, ), (, -
The 10 most frequent ambiguous lemmas: ” (PUNCT 580, SYM 4), + (PUNCT 16, SYM 4), * (PUNCT 10, SYM 5), > (PUNCT 10, SYM 1), = (PUNCT 5, SYM 3), _ (X 91, PUNCT 3), ~ (PUNCT 2, SYM 2), -> (SYM 2, PUNCT 1), :’( (PUNCT 1, SYM 1), :) (SYM 25, INTJ 5, PUNCT 1)
The 10 most frequent ambiguous types: ” (PUNCT 578, SYM 4), - (PUNCT 325, X 1), + (PUNCT 16, SYM 4), * (PUNCT 10, SYM 5), > (PUNCT 10, SYM 1), = (PUNCT 5, SYM 3), ~ (PUNCT 2, SYM 2), -> (SYM 2, PUNCT 1), 8 (NUM 15, PUNCT 1), :’( (PUNCT 1, SYM 1)
- ”
- -
- +
- *
- >
- =
- ~
- ->
  - SYM 2: Turbonohik : Tallin -> põhiliselt ikkagi Saku , aga ega A Le Coq-ist ka ära ei ütle …
  - PUNCT 1: Tartu Ülikooli esimeses füüsika ( ma ei tea , mis aine , vana klassivend rääkis , ehk füüsikaline maailmapilt ? ) loengus küsiti , et kes teist panid keskkoolis füüsika tunnis tähele -> mõned arglikud käed .
- 8
- :’(
Morphology
The form / lemma ratio of PUNCT is 1.046875 (the average of all parts of speech is 1.733702).
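The reported ratio is consistent with dividing the number of PUNCT types by the number of PUNCT lemmas given at the top of this page. A quick check, assuming that is how the ratio is defined:

```python
# Counts reported on this page for PUNCT.
punct_types = 67
punct_lemmas = 64

# Assumption: "form / lemma ratio" = distinct forms (types) / distinct lemmas.
ratio = punct_types / punct_lemmas
print(ratio)  # 1.046875
```

A ratio this close to 1 reflects that almost every punctuation lemma has a single surface form.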
The 1st highest number of forms (2) was observed with the lemma “””: ”, ``.
The 2nd highest number of forms (2) was observed with the lemma “(”: (, 8.
The 3rd highest number of forms (2) was observed with the lemma “,”: ’, ,.
PUNCT occurs with 2 features: Hyph (5; 0% instances), Typo (3; 0% instances)
PUNCT occurs with 2 feature-value pairs: Hyph=Yes, Typo=Yes
PUNCT occurs with 3 feature combinations.
The most frequent feature combination is _ (14838 tokens).
Examples: ,, ., :, ?, “, …, !, ), (, -
Relations
PUNCT nodes are attached to their parents using 2 different relations: punct (14840; 100% instances), root (6; 0% instances)
Parents of PUNCT nodes belong to 17 different parts of speech: VERB (7717; 52% instances), NOUN (2882; 19% instances), PROPN (1526; 10% instances), ADJ (1229; 8% instances), ADV (652; 4% instances), PRON (479; 3% instances), NUM (165; 1% instances), INTJ (105; 1% instances), X (32; 0% instances), SYM (23; 0% instances), AUX (14; 0% instances), DET (6; 0% instances), (6; 0% instances), ADP (5; 0% instances), PUNCT (2; 0% instances), SCONJ (2; 0% instances), CCONJ (1; 0% instances)
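As a consistency check, the parent counts listed above should add up to the total number of PUNCT tokens (14846), and each percentage should be the rounded share of that total. The sketch below copies the figures from this page; the key `"<unnamed>"` is a placeholder for the entry that appears without a tag name in the list.

```python
# Parent part-of-speech counts for PUNCT nodes, as listed on this page.
parent_counts = {
    "VERB": 7717, "NOUN": 2882, "PROPN": 1526, "ADJ": 1229,
    "ADV": 652, "PRON": 479, "NUM": 165, "INTJ": 105,
    "X": 32, "SYM": 23, "AUX": 14, "DET": 6,
    "<unnamed>": 6,  # placeholder: this entry has no tag name in the source
    "ADP": 5, "PUNCT": 2, "SCONJ": 2, "CCONJ": 1,
}

total = sum(parent_counts.values())
print(total)

# Percentage share of each parent tag, rounded to whole percent.
shares = {tag: round(100 * n / total) for tag, n in parent_counts.items()}
print(shares["VERB"], shares["NOUN"], shares["PROPN"])
```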
14844 (100%) PUNCT nodes are leaves.
1 (0%) PUNCT nodes have one child.
0 (0%) PUNCT nodes have two children.
1 (0%) PUNCT nodes have three or more children.
The highest child degree of a PUNCT node is 3.
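The child-degree figures above can be derived from the HEAD column of a dependency tree by counting how many nodes point to each PUNCT node. The following is an illustrative sketch; the function name `child_degree_histogram` and the toy trees are hypothetical and not drawn from the treebank.

```python
from collections import Counter


def child_degree_histogram(rows, upos):
    """rows: list of (id, upos, head) tuples for one sentence.

    Returns a Counter mapping number-of-children -> number of `upos` nodes.
    """
    # How many nodes name each id as their head.
    n_children = Counter(head for _, _, head in rows)
    hist = Counter()
    for node_id, tag, _ in rows:
        if tag == upos:
            hist[n_children[node_id]] += 1
    return hist


# Toy tree 1: both PUNCT nodes are leaves.
tree_a = [(1, "INTJ", 0), (2, "PUNCT", 1), (3, "NOUN", 1), (4, "PUNCT", 1)]
# Toy tree 2: a PUNCT root with two children, one of them another PUNCT.
tree_b = [(1, "PUNCT", 0), (2, "PUNCT", 1), (3, "SYM", 1)]

print(child_degree_histogram(tree_a, "PUNCT"))
print(child_degree_histogram(tree_b, "PUNCT"))
```

Summing such histograms over all sentences gives the leaf, one-child, two-children and three-or-more counts, and the largest key is the highest child degree.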
Children of PUNCT nodes are attached using 2 different relations: flat (2; 50% instances), punct (2; 50% instances)
Children of PUNCT nodes belong to 2 different parts of speech: PUNCT (2; 50% instances), SYM (2; 50% instances)