This page pertains to UD version 2.

Treebank Statistics: UD_Classical_Chinese-Kyoto: POS Tags: NUM

There are 284 NUM lemmas (2%), 284 NUM types (2%) and 8133 NUM tokens (2%). Out of 14 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 9 in number of tokens.

The 10 most frequent NUM lemmas: 三、 一、 五、 二、 百、 四、 萬、 千、 十、 九

The 10 most frequent NUM types: 三、 一、 五、 二、 百、 四、 萬、 千、 十、 九

The 10 most frequent ambiguous lemmas: 一 (NUM 1207, VERB 29, ADV 1), 五 (NUM 612, PROPN 1), 二 (NUM 584, VERB 2), 百 (NUM 566, PROPN 1), 萬 (NUM 347, PROPN 30, VERB 1), 什 (NUM 16, PROPN 1), 丁 (PROPN 24, NUM 14, NOUN 3, VERB 2), 雙 (NOUN 19, NUM 11), 仲 (PROPN 63, NUM 10, NOUN 6), 兆 (NOUN 10, NUM 7, VERB 3, ADV 1, PROPN 1)

The 10 most frequent ambiguous types: 一 (NUM 1207, VERB 29, ADV 1), 五 (NUM 612, PROPN 1), 二 (NUM 584, VERB 2), 百 (NUM 566, PROPN 1), 萬 (NUM 347, PROPN 30, VERB 1), 什 (NUM 16, PROPN 1), 丁 (PROPN 24, NUM 14, NOUN 3, VERB 2), 雙 (NOUN 18, NUM 11), 仲 (PROPN 63, NUM 10, NOUN 6), 兆 (NOUN 10, NUM 7, VERB 3, ADV 1, PROPN 1)

Morphology

The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.013130).
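The ratio above can be reproduced with a short sketch. It assumes the ratio is the number of distinct word forms divided by the number of distinct lemmas for the tag, which is consistent with the 284 NUM types and 284 NUM lemmas reported at the top of this page. The function name `form_lemma_ratio` and the toy data are hypothetical and are not taken from the treebank tooling.

```python
def form_lemma_ratio(tokens, upos):
    """Distinct forms divided by distinct lemmas for one UPOS tag.

    `tokens` is an iterable of (form, lemma, upos) triples.
    Assumption: this mirrors how the page derives its ratio.
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    if not lemmas:
        raise ValueError(f"no tokens tagged {upos}")
    return len(forms) / len(lemmas)


# Hypothetical toy data, not drawn from the treebank.
sample = [
    ("三", "三", "NUM"),
    ("一", "一", "NUM"),
    ("一", "一", "NUM"),   # repeated token: same form, same lemma
    ("百", "百", "NUM"),
    ("一", "一", "VERB"),  # other tags are ignored
]

print(f"{form_lemma_ratio(sample, 'NUM'):.6f}")
```

A ratio of exactly 1 means that every NUM lemma appears in a single written form, as expected for a language without inflection.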

The 1st highest number of forms (1) was observed with the lemma “一”: 一.

The 2nd highest number of forms (1) was observed with the lemma “一二”: 一二.

The 3rd highest number of forms (1) was observed with the lemma “一十”: 一十.

NUM occurs with 1 feature: NumType (141; 2% instances)

NUM occurs with 1 feature-value pair: NumType=Ord

NUM occurs with 2 feature combinations. The most frequent feature combination is _ (7992 tokens). Examples: 三、 一、 五、 二、 百、 四、 萬、 千、 十、 九

Relations

NUM nodes are attached to their parents using 22 different relations: nummod (5420; 67% instances), root (1342; 17% instances), obj (461; 6% instances), nsubj (263; 3% instances), conj (224; 3% instances), compound (141; 2% instances), obl:tmod (128; 2% instances), acl (38; 0% instances), obl (33; 0% instances), flat (29; 0% instances), ccomp (12; 0% instances), nsubj:outer (10; 0% instances), advcl (7; 0% instances), parataxis (7; 0% instances), csubj (5; 0% instances), dislocated (3; 0% instances), iobj (3; 0% instances), obl:lmod (2; 0% instances), xcomp (2; 0% instances), clf (1; 0% instances), compound:redup (1; 0% instances), list (1; 0% instances)
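A distribution like the one above can be recomputed from a CoNLL-U file with a few lines of Python. This is a minimal sketch, not the script that generated this page: the function name `deprel_distribution` and the two toy sentences are hypothetical, and the annotation shown is made up for illustration. It relies only on the standard CoNLL-U column layout (UPOS in column 4, DEPREL in column 8).

```python
from collections import Counter


def deprel_distribution(conllu_text, upos):
    """Count the DEPREL values of all tokens tagged `upos` in CoNLL-U text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank sentence separator or sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        if cols[3] == upos:       # column 4 is UPOS
            counts[cols[7]] += 1  # column 8 is DEPREL
    return counts


# Hypothetical toy annotation, NOT taken from the treebank.
rows = [
    ("1", "三", "三", "NUM", "_", "_", "2", "nummod", "_", "_"),
    ("2", "人", "人", "NOUN", "_", "_", "3", "nsubj", "_", "_"),
    ("3", "行", "行", "VERB", "_", "_", "0", "root", "_", "_"),
    (),  # an empty row becomes the blank line between sentences
    ("1", "二", "二", "NUM", "_", "_", "0", "root", "_", "_"),
]
toy = "\n".join("\t".join(r) for r in rows)

counts = deprel_distribution(toy, "NUM")
total = sum(counts.values())
for rel, n in counts.most_common():
    print(f"{rel} ({n}; {round(100 * n / total)}% instances)")
```

Because the percentages are rounded to whole numbers, relations with only a handful of instances show up as 0% in the list above even though their counts are nonzero.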

Parents of NUM nodes belong to 9 different parts of speech, counting the untagged artificial root as one: NOUN (4516; 56% instances), VERB (1642; 20% instances), [root, no POS tag] (1342; 17% instances), NUM (273; 3% instances), PART (173; 2% instances), PROPN (168; 2% instances), ADV (8; 0% instances), PRON (7; 0% instances), AUX (4; 0% instances)

5133 (63%) NUM nodes are leaves.

1471 (18%) NUM nodes have one child.

1137 (14%) NUM nodes have two children.

392 (5%) NUM nodes have three or more children.

The highest child degree of a NUM node is 6.
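The child-degree figures above (leaves, one child, two children, three or more) amount to a histogram of how many dependents each NUM node has. The following sketch shows one way to compute it; the function name `child_degree_histogram`, the input layout of `(id, upos, head)` triples, and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Map child count -> number of `upos` nodes with that many children.

    `sentences` is a list of sentences; each sentence is a list of
    (token_id, upos, head_id) triples with head_id 0 for the root.
    """
    hist = Counter()
    for sent in sentences:
        # How many tokens in this sentence point at each head id.
        n_children = Counter(head for _, _, head in sent)
        for tok_id, tag, _ in sent:
            if tag == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Hypothetical toy trees, not taken from the treebank.
sentences = [
    [(1, "NUM", 2), (2, "NOUN", 3), (3, "VERB", 0)],  # NUM is a leaf
    [(1, "NOUN", 2), (2, "NUM", 0), (3, "PART", 2)],  # NUM has two children
]

hist = child_degree_histogram(sentences, "NUM")
print("leaves:", hist[0])
print("highest child degree:", max(hist))
```

As a consistency check on the figures reported above, the four buckets add up to the total number of NUM tokens: 5133 + 1471 + 1137 + 392 = 8133.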

Children of NUM nodes are attached using 33 different relations: clf (2026; 40% instances), nsubj (871; 17% instances), nmod (430; 9% instances), conj (395; 8% instances), csubj (347; 7% instances), case (241; 5% instances), amod (110; 2% instances), discourse:sp (110; 2% instances), advmod (78; 2% instances), nummod (54; 1% instances), nsubj:outer (49; 1% instances), cc (48; 1% instances), cop (48; 1% instances), det (48; 1% instances), parataxis (35; 1% instances), obl:tmod (27; 1% instances), flat (23; 0% instances), acl (22; 0% instances), obj (17; 0% instances), obl:lmod (10; 0% instances), obl (8; 0% instances), advcl (6; 0% instances), discourse (6; 0% instances), dislocated (6; 0% instances), aux (4; 0% instances), csubj:outer (4; 0% instances), list (3; 0% instances), expl (2; 0% instances), mark (2; 0% instances), nsubj:pass (2; 0% instances), compound:redup (1; 0% instances), vocative (1; 0% instances), xcomp (1; 0% instances)

Children of NUM nodes belong to 11 different parts of speech: NOUN (3191; 63% instances), VERB (584; 12% instances), PART (376; 7% instances), NUM (273; 5% instances), SCONJ (224; 4% instances), ADV (130; 3% instances), PRON (69; 1% instances), PROPN (58; 1% instances), ADP (57; 1% instances), AUX (53; 1% instances), CCONJ (20; 0% instances)