
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-GSD: POS Tags: PUNCT

There are 102 PUNCT lemmas (0%), 105 PUNCT types (0%) and 10411 PUNCT tokens (13%). Out of 16 observed tags, the rank of PUNCT is: 10 in number of lemmas, 10 in number of types and 4 in number of tokens.
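As an illustration of how such counts can be derived, here is a minimal sketch that tallies lemmas, types and tokens for one tag from CoNLL-U text. The function name `count_upos` and the sample sentences are hypothetical, and the sketch assumes that types are distinct word forms compared exactly as written; the script that generated this page may normalize differently.

```python
def count_upos(conllu_text, upos):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:  # column 4 is UPOS
            lemmas.add(cols[2])  # column 3 is LEMMA
            types.add(cols[1])   # column 2 is FORM
            tokens += 1
    return len(lemmas), len(types), tokens


# Tiny made-up sample, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = A, B.",
    "1\tA\ta\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\t,\t,\tPUNCT\t_\t_\t3\tpunct\t_\t_",
    "3\tB\tb\tNOUN\t_\t_\t1\tconj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
    "# text = C.",
    "1\tC\tc\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])

print(count_upos(SAMPLE, "PUNCT"))  # (2, 2, 3)
```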

The 10 most frequent PUNCT lemmas: ., ,, ‘, (, ), “, %, ?, !, •

The 10 most frequent PUNCT types: ., ,, ‘, (, ), “, %, ?, !, •

The 10 most frequent ambiguous lemmas: % (PUNCT 137, SYM 45), ? (PUNCT 134, SYM 1), ~ (PUNCT 69, SYM 1), 이+다 (PUNCT 19, NOUN 1, VERB 1), ㎡ (PUNCT 13, SYM 4), ㎞ (SYM 7, PUNCT 5), ^ (PUNCT 3, SYM 1), ℓ (PUNCT 3, SYM 1), ㎢ (PUNCT 3, SYM 3), ㎝ (PUNCT 2, SYM 1)

The 10 most frequent ambiguous types: % (PUNCT 137, SYM 45), ? (PUNCT 134, SYM 1), ~ (PUNCT 69, SYM 1), 이다 (PUNCT 14, AUX 1, NOUN 1, VERB 1), ㎡ (PUNCT 13, SYM 4), ㎞ (SYM 7, PUNCT 5), 다 (ADV 46, PUNCT 5, NOUN 3), ^ (PUNCT 3, SYM 1), ℓ (PUNCT 3, SYM 1), ㎢ (PUNCT 3, SYM 3)
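A lemma or type counts as ambiguous here when it occurs with more than one tag. The following sketch shows one way to find such items from (lemma, tag) pairs; `ambiguous_lemmas` and the sample rows are hypothetical and the counts in the sample are not the treebank's.

```python
from collections import Counter, defaultdict


def ambiguous_lemmas(rows, upos):
    """rows: iterable of (lemma, upos_tag) pairs.

    Return the lemmas tagged `upos` at least once and some other tag
    at least once, together with their per-tag counts.
    """
    counts = defaultdict(Counter)
    for lemma, tag in rows:
        counts[lemma][tag] += 1
    return {
        lemma: dict(tags)
        for lemma, tags in counts.items()
        if upos in tags and len(tags) > 1
    }


# Made-up rows for illustration only.
rows = [("%", "PUNCT"), ("%", "PUNCT"), ("%", "SYM"), (".", "PUNCT"), ("km", "SYM")]
result = ambiguous_lemmas(rows, "PUNCT")
print(result)  # {'%': {'PUNCT': 2, 'SYM': 1}}
```

Passing (form, tag) pairs instead of (lemma, tag) pairs gives the corresponding list of ambiguous types.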

Morphology

The form / lemma ratio of PUNCT is 1.029412 (the average of all parts of speech is 1.001499).
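The reported ratio agrees with dividing the number of PUNCT types by the number of PUNCT lemmas given above. The short check below assumes that this is how the ratio is defined; the page itself does not state the formula.

```python
# Counts quoted on this page for PUNCT in UD_Korean-GSD.
punct_types = 105
punct_lemmas = 102

# Assumed definition: forms (types) per lemma.
ratio = punct_types / punct_lemmas
print(f"{ratio:.6f}")  # 1.029412
```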

The 1st highest number of forms (2) was observed with the lemma “<”: <, <.

The 2nd highest number of forms (2) was observed with the lemma “이+다”: 다, 이다.

The 3rd highest number of forms (2) was observed with the lemma “이+었+다”: 였다, 이었다.

PUNCT occurs with 1 feature: NumType (16; 0% instances)

PUNCT occurs with 1 feature-value pair: NumType=Card

PUNCT occurs with 2 feature combinations. The most frequent feature combination is _ (10395 tokens). Examples: ., ,, ‘, (, ), “, %, ?, !, •

Relations

PUNCT nodes are attached to their parents using 1 relation: punct (10411; 100% instances)

Parents of PUNCT nodes belong to 15 different parts of speech: VERB (4842; 47% instances), NOUN (3093; 30% instances), ADJ (824; 8% instances), SYM (428; 4% instances), NUM (371; 4% instances), PROPN (371; 4% instances), ADV (251; 2% instances), ADP (144; 1% instances), AUX (25; 0% instances), DET (22; 0% instances), PRON (18; 0% instances), PUNCT (8; 0% instances), CCONJ (6; 0% instances), INTJ (6; 0% instances), PART (2; 0% instances)
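A parent distribution of this kind can be computed by following the HEAD column of each PUNCT token to the token it points at. The sketch below is one possible way to do that; `parent_upos_counts` and the sample sentences are hypothetical, and tokens attached to the artificial root (HEAD 0) are assumed to have no parent.

```python
from collections import Counter


def parent_upos_counts(conllu_text, upos):
    """Count the UPOS of the head of every token tagged `upos`."""
    counts = Counter()
    # Sentences are separated by blank lines in CoNLL-U.
    for block in conllu_text.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            cols = line.split("\t")
            if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
                continue  # comments, multiword ranges, empty nodes
            tokens[cols[0]] = (cols[3], cols[6])  # ID -> (UPOS, HEAD)
        for tag, head in tokens.values():
            if tag == upos and head in tokens:  # HEAD 0 is not a token
                counts[tokens[head][0]] += 1
    return counts


# Made-up sample, not taken from the treebank.
SAMPLE = "\n".join([
    "1\twent\tgo\tVERB\t_\t_\t0\troot\t_\t_",
    "2\thome\thome\tNOUN\t_\t_\t1\tobj\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
    "",
    "1\t(\t(\tPUNCT\t_\t_\t2\tpunct\t_\t_",
    "2\tnote\tnote\tNOUN\t_\t_\t0\troot\t_\t_",
    "3\t)\t)\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])

print(dict(parent_upos_counts(SAMPLE, "PUNCT")))  # {'VERB': 1, 'NOUN': 2}
```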

10407 (100%) PUNCT nodes are leaves.

0 (0%) PUNCT nodes have one child.

4 (0%) PUNCT nodes have two children.

The highest child degree of a PUNCT node is 2.
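The figures above are mutually consistent: 10407 leaves plus 4 nodes with two children give the 10411 PUNCT tokens, and 4 nodes with two children each account for 8 child attachments. A child-degree histogram like this can be sketched as follows; `child_degree_histogram` and the sample sentence are hypothetical.

```python
from collections import Counter


def child_degree_histogram(tokens, upos):
    """tokens: list of (id, upos, head) triples for ONE sentence.

    Return {number_of_children: number_of_nodes} for nodes tagged `upos`.
    """
    # How many tokens name each id as their head.
    children = Counter(head for _, _, head in tokens)
    # A Counter returns 0 for ids that never appear as a head (leaves).
    return dict(Counter(children[tid] for tid, tag, _ in tokens if tag == upos))


# Made-up sentence: token 2 is a PUNCT node with two PUNCT children.
sentence = [(1, "NOUN", 0), (2, "PUNCT", 1), (3, "PUNCT", 2), (4, "PUNCT", 2)]
hist = child_degree_histogram(sentence, "PUNCT")
print(hist)  # {2: 1, 0: 2}
```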

Children of PUNCT nodes are attached using 1 relation: punct (8; 100% instances)

Children of PUNCT nodes belong to 1 part of speech: PUNCT (8; 100% instances)