
This page pertains to UD version 2.

Treebank Statistics: UD_Korean-Kaist: POS Tags: VERB

There are 23258 VERB lemmas (23%), 23267 VERB types (23%) and 65646 VERB tokens (19%). Out of 17 observed tags, the rank of VERB is: 2 in number of lemmas, 2 in number of types and 2 in number of tokens.

The 10 most frequent VERB lemmas: 것+이+다, 이러하+ㄴ, 대하+ㄴ, 되+었+다, 되+ㄴ다, 하+ㄹ, 하+는, 때문+이+다, 하+ㄴ다, 것+이+ㅂ니다

The 10 most frequent VERB types: 것이다, 이러한, 대한, 된다, 되었다, 할, 하는, 때문이다, 한다, 것입니다

The 10 most frequent ambiguous lemmas: 것+이+다 (VERB 2338, SCONJ 1), 이러하+ㄴ (VERB 825, ADJ 10), 하+는 (VERB 394, PART 2, PROPN 2), 하+ㄴ다 (VERB 374, SCONJ 1), 하+었+다 (VERB 262, ADJ 1), 그러하+ㄴ (VERB 181, ADJ 3), 하+ㄴ (VERB 173, PART 2), 가지+고 (VERB 144, CCONJ 63, SCONJ 11), 관하+ㄴ (VERB 134, ADJ 1), 오+았+다 (VERB 128, CCONJ 1)

The 10 most frequent ambiguous types: 것이다 (VERB 2335, SCONJ 1), 이러한 (VERB 825, ADJ 10), 할 (VERB 444, AUX 263, NOUN 1), 하는 (VERB 394, AUX 198, PART 2, PROPN 2), 한다 (AUX 478, VERB 373, SCONJ 1), 볼 (VERB 280, NOUN 1), 중요한 (VERB 205, ADJ 6), 이런 (DET 206, VERB 197, ADJ 1), 그러한 (VERB 179, ADJ 3), 한 (NUM 577, VERB 173, ADJ 69, NOUN 46, AUX 41, PROPN 32, DET 4, PART 2)

Morphology

The form / lemma ratio of VERB is 1.000387 (the average of all parts of speech is 0.998034).
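The ratio above can be reproduced from the counts reported at the top of this page. The sketch below assumes the ratio is the number of distinct VERB types divided by the number of distinct VERB lemmas. That reading is consistent with the reported figures, but it is not stated explicitly on the page. The variable names are illustrative only.

```python
# Counts reported on this page for VERB in UD_Korean-Kaist.
verb_types = 23267   # distinct word forms tagged VERB
verb_lemmas = 23258  # distinct lemmas tagged VERB

# Assumption: form / lemma ratio = distinct types / distinct lemmas.
ratio = verb_types / verb_lemmas
print(f"{ratio:.6f}")  # prints 1.000387
```

A ratio this close to 1 means that almost every lemma occurs with exactly one surface form, which fits the lemma lists above, where each lemma is a morpheme-by-morpheme segmentation of a single word form.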

The highest number of forms (4) was observed with the lemma “것+이+다”: 것디다, 것이다, 것이다라고, 게다.

The second highest number of forms (4) was observed with the lemma “되+었+다”: 됐다, 되엇다, 되었다, 되였다.

The third highest number of forms (3) was observed with the lemma “되+어야겠+습니다”: 돼야겠습니다, 되야겠습니다, 되어야겠습니다.

VERB does not occur with any features.

Relations

VERB nodes are attached to their parents using 18 different relations: root (20574; 31% instances), acl (19968; 30% instances), conj (7541; 11% instances), amod (5064; 8% instances), ccomp (4098; 6% instances), advcl (3639; 6% instances), compound (2954; 4% instances), nmod (1149; 2% instances), dep (312; 0% instances), obj (144; 0% instances), fixed (71; 0% instances), dislocated (48; 0% instances), obl (27; 0% instances), xcomp (22; 0% instances), nsubj (12; 0% instances), appos (11; 0% instances), flat (10; 0% instances), csubj (2; 0% instances)
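As a consistency check, the percentages in the relation list above can be recomputed from the raw counts. The sketch below assumes each percentage is the relation count divided by the total number of VERB tokens, rounded to a whole percent. The dictionary and variable names are illustrative and not part of any UD tool.

```python
# Counts copied from the relation list above (VERB nodes by incoming relation).
verb_relations = {
    "root": 20574, "acl": 19968, "conj": 7541, "amod": 5064,
    "ccomp": 4098, "advcl": 3639, "compound": 2954, "nmod": 1149,
    "dep": 312, "obj": 144, "fixed": 71, "dislocated": 48,
    "obl": 27, "xcomp": 22, "nsubj": 12, "appos": 11,
    "flat": 10, "csubj": 2,
}

# Every VERB token has exactly one incoming relation, so the counts
# should add up to the number of VERB tokens reported on this page.
total = sum(verb_relations.values())

# Assumption: the page rounds each share to a whole percent.
shares = {rel: round(100 * n / total) for rel, n in verb_relations.items()}

for rel, n in verb_relations.items():
    print(f"{rel}: {n} ({shares[rel]}%)")
```

The counts sum to 65646, the VERB token count given at the top of the page. Relations shown with 0% are present but each accounts for less than half a percent of VERB nodes.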

Parents of VERB nodes belong to 12 different parts of speech: none, because the node is the root (20574; 31% instances), NOUN (17806; 27% instances), VERB (12664; 19% instances), ADV (5881; 9% instances), CCONJ (5812; 9% instances), SCONJ (1380; 2% instances), ADJ (674; 1% instances), PROPN (643; 1% instances), PRON (147; 0% instances), NUM (36; 0% instances), X (16; 0% instances), PART (13; 0% instances)

10427 (16%) VERB nodes are leaves.

17455 (27%) VERB nodes have one child.

10527 (16%) VERB nodes have two children.

27237 (41%) VERB nodes have three or more children.

The highest child degree of a VERB node is 10.
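The child-degree figures above could be gathered from a CoNLL-U file roughly as follows. This is a minimal sketch, not the script that generated this page: the function name `verb_child_degrees` is hypothetical, the sample sentence is invented for illustration, and the code relies only on the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
from collections import Counter


def verb_child_degrees(conllu_text):
    """Return a Counter mapping child count -> number of VERB nodes."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):
        upos = {}             # token id -> UPOS tag
        children = Counter()  # head id -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword ranges (1-2) and empty nodes (1.1)
            upos[cols[0]] = cols[3]
            children[cols[6]] += 1
        for tok_id, tag in upos.items():
            if tag == "VERB":
                degrees[children[tok_id]] += 1
    return degrees


# Toy sentence invented for illustration; not taken from the treebank.
SAMPLE = "\n".join([
    "# toy sentence, invented for illustration",
    "\t".join(["1", "a", "a", "NOUN", "_", "_", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "b", "b", "VERB", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["3", "c", "c", "VERB", "_", "_", "2", "conj", "_", "_"]),
    "\t".join(["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
])

degrees = verb_child_degrees(SAMPLE)
print(degrees[0], degrees[3])  # one leaf VERB, one VERB with three children
```

On the full treebank, `degrees[0]` would correspond to the leaf count, `degrees[1]` and `degrees[2]` to the one-child and two-child counts, the sum over keys of 3 or more to the last bucket, and `max(degrees)` to the highest child degree.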

Children of VERB nodes are attached using 27 different relations: punct (20633; 14% instances), advcl (17685; 12% instances), obj (16349; 11% instances), dislocated (15609; 11% instances), ccomp (13129; 9% instances), obl (12988; 9% instances), nsubj (10730; 7% instances), advmod (9642; 7% instances), aux (7348; 5% instances), compound (4411; 3% instances), cc (4063; 3% instances), xcomp (3202; 2% instances), conj (2393; 2% instances), nmod (1835; 1% instances), amod (1549; 1% instances), dep (912; 1% instances), iobj (693; 0% instances), mark (610; 0% instances), csubj (582; 0% instances), acl (295; 0% instances), nummod (289; 0% instances), det (107; 0% instances), case (98; 0% instances), discourse (29; 0% instances), cop (18; 0% instances), vocative (14; 0% instances), appos (5; 0% instances)

Children of VERB nodes belong to 17 different parts of speech: NOUN (44080; 30% instances), ADV (30805; 21% instances), PUNCT (20633; 14% instances), VERB (12664; 9% instances), SCONJ (12295; 8% instances), CCONJ (8117; 6% instances), AUX (7366; 5% instances), PRON (4130; 3% instances), PROPN (3281; 2% instances), ADJ (853; 1% instances), NUM (596; 0% instances), DET (107; 0% instances), SYM (90; 0% instances), PART (73; 0% instances), ADP (63; 0% instances), INTJ (34; 0% instances), X (31; 0% instances)