Treebank Statistics: UD_Korean-Kaist: POS Tags: NOUN
There are 36315 NOUN lemmas (36%), 36002 NOUN types (36%) and 105020 NOUN tokens (30%).
Out of 17 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
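The lemma, type and token counts above could be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch, not the script that generated this page: the names `row`, `upos_counts` and `SAMPLE` are hypothetical, the two toy sentences are invented for illustration, and the sketch assumes that a type is a distinct (case-sensitive) word form and that multiword-token ranges and empty nodes are skipped.

```python
from collections import defaultdict

def row(idx, form, lemma, upos, head, deprel):
    # Fill the CoNLL-U columns this sketch does not use with "_".
    return "\t".join([idx, form, lemma, upos, "_", "_", head, deprel, "_", "_"])

# Toy data, invented for illustration (not taken from the treebank).
SAMPLE = "\n".join([
    "# toy sentence 1",
    row("1", "수", "수", "NOUN", "2", "obj"),
    row("2", "있다", "있+다", "VERB", "0", "root"),
    row("3", ".", ".", "PUNCT", "2", "punct"),
    "",
    "# toy sentence 2",
    row("1", "것은", "것+은", "NOUN", "2", "nsubj"),
    row("2", "건", "것+은", "NOUN", "0", "root"),
])

def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # sentence boundary or comment
        cols = line.split("\t")
        # Keep only regular word lines; IDs like "1-2" or "1.1" are skipped.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}

counts = upos_counts(SAMPLE)
print(counts["NOUN"])
```

Ranking the tags, as in the sentence above, would then amount to sorting the resulting dictionary by each of the three components in turn.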
The 10 most frequent NOUN lemmas: 수, 것+은, 것+이, 것+을, 때, 년, 가지, 것, 데, 등
The 10 most frequent NOUN types: 수, 것은, 것이, 것을, 때, 년, 가지, 것, 데, 경우
The 10 most frequent ambiguous lemmas: 수 (NOUN 2457, ADJ 1, NUM 1, PROPN 1), 것+이 (NOUN 896, PRON 1), 자신+의 (NOUN 187, PRON 30), 당시 (NOUN 173, ADV 2), 이후 (NOUN 153, ADV 3), 중 (NOUN 135, PROPN 2), 문제+는 (NOUN 115, PROPN 2), 오늘날 (NOUN 111, ADV 2), 현재 (NOUN 111, ADV 6), 이상 (NOUN 88, PROPN 1)
The 10 most frequent ambiguous types: 수 (NOUN 2457, ADJ 2, NUM 1, PROPN 1), 것이 (NOUN 884, PRON 1), 가지 (NOUN 265, VERB 8), 자신의 (NOUN 187, PRON 30), 당시 (NOUN 173, ADV 2), 이후 (NOUN 153, ADV 3), 중 (NOUN 134, PROPN 2), 문제는 (NOUN 115, PROPN 2), 오늘날 (NOUN 112, ADV 2), 현재 (NOUN 110, ADV 6)
- 수
  - NOUN 2457: 학생들은 교과서를 읽으면서 공부하다가 부딛치는 수많은 법학인물들을 이 사전을 통하여 보다 가까이 알 수 있게 될 것이다 .
  - ADJ 2: 현재 한국에서 성취된 광통신 최고 속도는 한국전자통신연구소의 565 메가비트 시스템으로서 , 수 년 후에는 2 3 기가비트를 기대해 보고 있다 .
  - NUM 1: 그리하여 동학조직은 수운 당시부터 매우 끈끈한 공동체로서 자리잡기 시작하여 그가 처형당한 후에도 수 십년간 지하 조직으로 존립을 가능하게 하였다 .
  - PROPN 1: 제국의 경계는 , 동으로 한반도 북부 , 북으로 바이칼호와 이르티시 ( Irtish ) 강변 , 서로는 아랄해 , 남으로는 중국의 웨이수이 ( 수 ) 와 티베트 고원 , 그리고 카라코람산맥을 잇는 거대한 영토를 포함하게 되었다 .
- 것이
- 가지
- 자신의
- 당시
- 이후
- 중
- 문제는
- 오늘날
- 현재
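The list of ambiguous types above pairs each word form with the tags it was observed with. A minimal sketch of how such a list might be derived is shown below, assuming that "ambiguous" means the form occurs with NOUN and at least one other tag, and that the list is ordered by NOUN frequency (the counts above are consistent with that ordering). The function name `ambiguous_types` and the toy `pairs` data are hypothetical.

```python
from collections import Counter, defaultdict

def ambiguous_types(pairs, upos="NOUN"):
    """Return [(form, Counter of tags)] for forms seen with `upos` and another tag."""
    by_form = defaultdict(Counter)
    for form, tag in pairs:
        by_form[form][tag] += 1
    # Keep forms that carry the target tag plus at least one other tag.
    ambiguous = [(f, c) for f, c in by_form.items() if upos in c and len(c) > 1]
    # Most frequent as `upos` first.
    return sorted(ambiguous, key=lambda item: -item[1][upos])

# Toy (form, tag) observations, invented for illustration.
pairs = (
    [("당시", "NOUN")] * 2 + [("당시", "ADV")]
    + [("수", "NOUN")] * 3 + [("수", "NUM")]
    + [("때", "NOUN")]  # unambiguous, so it is left out of the result
)

result = ambiguous_types(pairs)
for form, tags in result:
    print(form, dict(tags))
```

The same function applied to (lemma, tag) pairs instead of (form, tag) pairs would give the corresponding list of ambiguous lemmas.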
Morphology
The form / lemma ratio of NOUN is 0.991381 (the average of all parts of speech is 0.998034).
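The reported ratio appears consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The short check below makes that arithmetic explicit; the variable names are illustrative, and the interpretation (types divided by lemmas) is an assumption inferred from the figures rather than a documented definition.

```python
# Figures quoted earlier on this page.
noun_types = 36002
noun_lemmas = 36315

# Assumed definition: distinct word forms per distinct lemma.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # prints 0.991381
```

A value below 1 means there are fewer distinct forms than distinct lemmas, which is possible here because lemmas encode morpheme segmentation (for example 것+은) and several lemmas can share one surface form.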
The highest number of forms (4) was observed with the lemma “것+은”: 건, 건은, 것은, 것을.
The 2nd highest number of forms (3) was observed with the lemma “대하+어서+는”: 대하여서는, 대해서는, 대해선.
The 3rd highest number of forms (3) was observed with the lemma “등+의”: 드의, 등의, 등이.
NOUN does not occur with any features.
Relations
NOUN nodes are attached to their parents using 23 different relations: obj (20542; 20% instances), compound (18701; 18% instances), nmod (15343; 15% instances), dislocated (15028; 14% instances), nsubj (14281; 14% instances), obl (7833; 7% instances), conj (5711; 5% instances), amod (2974; 3% instances), dep (1985; 2% instances), csubj (1145; 1% instances), ccomp (404; 0% instances), appos (334; 0% instances), root (287; 0% instances), advcl (254; 0% instances), acl (121; 0% instances), fixed (34; 0% instances), flat (23; 0% instances), vocative (8; 0% instances), iobj (5; 0% instances), xcomp (3; 0% instances), nummod (2; 0% instances), clf (1; 0% instances), parataxis (1; 0% instances)
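The percentages in the relation list can be checked against the raw counts: the 23 counts sum to 105020, the total number of NOUN tokens, and each share is the count divided by that total, rounded to a whole percent. The sketch below verifies this for the five most frequent relations; the dictionary and variable names are illustrative.

```python
NOUN_TOKENS = 105020  # total NOUN tokens reported on this page

# Counts of the five most frequent incoming relations, copied from the list above.
relation_counts = {
    "obj": 20542,
    "compound": 18701,
    "nmod": 15343,
    "dislocated": 15028,
    "nsubj": 14281,
}

# Share of each relation among all NOUN tokens, rounded to a whole percent.
shares = {rel: round(100 * n / NOUN_TOKENS) for rel, n in relation_counts.items()}
for rel, pct in shares.items():
    print(f"{rel}: {pct}%")
```

Because of this rounding, relations with fewer than about 525 instances show up as 0%.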
Parents of NOUN nodes belong to 15 different parts of speech: VERB (44080; 42% instances), NOUN (26884; 26% instances), ADV (10195; 10% instances), CCONJ (9221; 9% instances), SCONJ (7911; 8% instances), ADJ (5110; 5% instances), PROPN (683; 1% instances), NUM (413; 0% instances), no part of speech, i.e. the artificial root node (287; 0% instances), PRON (111; 0% instances), X (67; 0% instances), PART (46; 0% instances), SYM (5; 0% instances), AUX (4; 0% instances), INTJ (3; 0% instances)
45834 (44%) NOUN nodes are leaves.
44890 (43%) NOUN nodes have one child.
11467 (11%) NOUN nodes have two children.
2829 (3%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 10.
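The four child-count figures above sum to 105020, so every NOUN token falls into exactly one bucket. A minimal sketch of how such a distribution might be computed from the HEAD column of one sentence follows; the function name `child_degree_histogram` and the toy sentence are hypothetical, and each word is reduced to an `(id, upos, head)` tuple for brevity.

```python
from collections import Counter

def child_degree_histogram(sentence, upos="NOUN"):
    """Map number of children -> number of `upos` nodes with that many children.

    `sentence` is a list of (id, upos, head) tuples; head 0 is the artificial root.
    """
    # Count how many words point to each head ID.
    children = Counter(head for _, _, head in sentence)
    # A Counter returns 0 for IDs that never appear as a head (leaves).
    return Counter(children[idx] for idx, tag, _ in sentence if tag == upos)

# Toy sentence, invented for illustration.
sentence = [
    (1, "NOUN", 2),   # leaf, depends on word 2
    (2, "NOUN", 4),   # has one child (word 1)
    (3, "NOUN", 4),   # leaf
    (4, "VERB", 0),   # root of the sentence
    (5, "PUNCT", 4),
]

hist = child_degree_histogram(sentence)
print(dict(hist))
```

Summing such histograms over all sentences and taking the largest key would give the highest child degree.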
Children of NOUN nodes are attached using 27 different relations: compound (15404; 20% instances), acl (15217; 20% instances), nmod (13569; 17% instances), amod (10568; 14% instances), punct (6286; 8% instances), det (3464; 4% instances), fixed (2890; 4% instances), nummod (2134; 3% instances), conj (2073; 3% instances), obl (869; 1% instances), case (761; 1% instances), appos (663; 1% instances), advmod (653; 1% instances), cc (580; 1% instances), nsubj (454; 1% instances), advcl (432; 1% instances), obj (413; 1% instances), dislocated (341; 0% instances), dep (250; 0% instances), cop (201; 0% instances), ccomp (171; 0% instances), aux (117; 0% instances), xcomp (81; 0% instances), iobj (32; 0% instances), csubj (16; 0% instances), mark (14; 0% instances), discourse (2; 0% instances)
Children of NOUN nodes belong to 17 different parts of speech: NOUN (26884; 35% instances), VERB (17806; 23% instances), ADJ (7325; 9% instances), PUNCT (6286; 8% instances), PROPN (4598; 6% instances), DET (3464; 4% instances), CCONJ (3314; 4% instances), NUM (2506; 3% instances), ADV (1721; 2% instances), PRON (1701; 2% instances), ADP (687; 1% instances), SCONJ (556; 1% instances), AUX (320; 0% instances), X (311; 0% instances), PART (103; 0% instances), SYM (70; 0% instances), INTJ (3; 0% instances)