Treebank Statistics: UD_Chinese-Beginner: POS Tags: NUM
There are 32 NUM lemmas (2%), 32 NUM types (2%) and 640 NUM tokens (3%).
Out of 15 observed tags, the rank of NUM is: 7 in number of lemmas, 7 in number of types and 9 in number of tokens.
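As a rough illustration of how lemma, type and token counts like these could be derived, the sketch below tallies one UPOS tag from CoNLL-U text. It is not the script that produced this page. The function name `count_upos` and the toy rows are assumptions made for the example, and the rows are invented rather than taken from the treebank.

```python
from collections import Counter

def count_upos(conllu_text, upos="NUM"):
    """Return (lemma counts, type counts) for tokens tagged with `upos`."""
    lemmas, types = Counter(), Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, multiword ranges (1-2)
        # and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, tag = cols[1], cols[2], cols[3]
        if tag == upos:
            types[form] += 1
            lemmas[lemma] += 1
    return lemmas, types

# Toy data, invented for illustration (not taken from the treebank).
ROWS = [
    ["1", "一", "一", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"],
    ["2", "人", "人", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["3", "一", "一", "NUM", "_", "NumType=Card", "4", "nummod", "_", "_"],
    ["4", "天", "天", "NOUN", "_", "_", "2", "conj", "_", "_"],
    ["5", "两", "两", "NUM", "_", "NumType=Card", "2", "dep", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

lemmas, types = count_upos(SAMPLE)
# Distinct lemmas, distinct types, total tokens for NUM in the toy data.
print(len(lemmas), len(types), sum(types.values()))
print(lemmas.most_common(1))
```

Ranking lemmas or types by frequency then reduces to `Counter.most_common`, which is how a "10 most frequent" list could be obtained.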
The 10 most frequent NUM lemmas: 一、 两、 十、 三、 几、 五、 1、 二、 八、 四
The 10 most frequent NUM types: 一、 两、 十、 三、 几、 五、 1、 二、 八、 四
The 10 most frequent ambiguous lemmas: 一 (NUM 170, ADV 2, DET 2), 几 (NUM 41, DET 1, PRON 1), 二 (NUM 23, NOUN 1), 半 (NOUN 28, NUM 11), 第 (NUM 9, NOUN 3), 零 (NUM 5, ADV 1), 多 (ADJ 86, ADV 40, NUM 4), 双 (NOUN 2, NUM 1)
The 10 most frequent ambiguous types: 一 (NUM 170, ADV 2, DET 2), 几 (NUM 41, DET 1, PRON 1), 二 (NUM 23, NOUN 1), 半 (NOUN 28, NUM 11), 第 (NUM 9, NOUN 3), 零 (NUM 5, ADV 1), 多 (ADJ 86, ADV 40, NUM 4), 双 (NOUN 2, NUM 1)
Morphology
The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.000000).
The 1st highest number of forms (1) was observed with the lemma “0”: 0.
The 2nd highest number of forms (1) was observed with the lemma “1”: 1.
The 3rd highest number of forms (1) was observed with the lemma “2”: 2.
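The ratio and the per-lemma form counts above can be sketched as follows. This assumes that the form / lemma ratio is the number of distinct word forms divided by the number of distinct lemmas, which is consistent with 32 types over 32 lemmas giving 1.0. The exact definition used by the generating script is not stated here. The helper name `form_lemma_ratio` and the sample pairs are hypothetical.

```python
from collections import defaultdict

def form_lemma_ratio(pairs):
    """pairs: list of (form, lemma) tuples for one part of speech.

    Returns (ratio, forms_by_lemma), where ratio is
    distinct forms / distinct lemmas (an assumed definition).
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    distinct_forms = {form for form, _ in pairs}
    return len(distinct_forms) / len(forms_by_lemma), forms_by_lemma

# Invented pairs: every lemma has exactly one form, as reported for NUM.
ratio, forms_by_lemma = form_lemma_ratio([("一", "一"), ("两", "两"), ("一", "一")])
print(ratio)
# Largest number of distinct forms observed for a single lemma.
print(max(len(forms) for forms in forms_by_lemma.values()))
```

A ratio of exactly 1 means that no lemma appears in more than one surface form, which is why every "highest number of forms" entry above reports a single form.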
NUM occurs with 1 feature: NumType (599; 94% instances)
NUM occurs with 2 feature-value pairs: NumType=Card, NumType=Ord
NUM occurs with 3 feature combinations.
The most frequent feature combination is NumType=Card (592 tokens).
Examples: 一、 两、 十、 三、 几、 五、 1、 二、 八、 四
Relations
NUM nodes are attached to their parents using 8 different relations: nummod (474; 74% instances), flat (106; 17% instances), dep (23; 4% instances), nmod (12; 2% instances), conj (10; 2% instances), obj (9; 1% instances), obl (3; 0% instances), root (3; 0% instances)
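The percentages in this list are shares of the 640 NUM tokens. A minimal check of that arithmetic uses the counts quoted above and assumes rounding to the nearest whole percent. The helper name `percent_shares` is hypothetical.

```python
def percent_shares(counts):
    """Map each key to its share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {key: round(100 * value / total) for key, value in counts.items()}

# Counts copied from the relation list above.
NUM_PARENT_RELATIONS = {
    "nummod": 474, "flat": 106, "dep": 23, "nmod": 12,
    "conj": 10, "obj": 9, "obl": 3, "root": 3,
}

shares = percent_shares(NUM_PARENT_RELATIONS)
print(sum(NUM_PARENT_RELATIONS.values()))  # total NUM tokens covered
print(shares)
```

Relations with only 3 instances round down to 0%, which explains the "0% instances" entries for obl and root.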
Parents of NUM nodes belong to 4 different parts of speech: NOUN (486; 76% instances), NUM (139; 22% instances), VERB (12; 2% instances), [root, no parent] (3; 0% instances)
501 (78%) NUM nodes are leaves.
106 (17%) NUM nodes have one child.
17 (3%) NUM nodes have two children.
16 (3%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
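The leaf and child-degree figures above amount to a histogram of how many dependents each NUM node has. A minimal sketch under stated assumptions: a sentence is represented as a list of `(id, head, upos)` tuples, a layout chosen for this example only, and the toy tree is invented.

```python
from collections import Counter

def child_degree_histogram(sentence, upos="NUM"):
    """sentence: list of (id, head, upos) tuples for one dependency tree.

    Returns a Counter mapping child degree -> number of `upos` nodes.
    """
    # Number of dependents per head ID; a missing key counts as 0 (a leaf).
    children = Counter(head for _, head, _ in sentence)
    return Counter(children[node_id] for node_id, _, tag in sentence if tag == upos)

# Invented tree: node 1 (NUM) heads nodes 3 and 4; node 3 (NUM) is a leaf.
TOY_SENTENCE = [(1, 2, "NUM"), (2, 0, "NOUN"), (3, 1, "NUM"), (4, 1, "PUNCT")]

histogram = child_degree_histogram(TOY_SENTENCE)
print(histogram)       # degree -> count of NUM nodes
print(max(histogram))  # highest child degree among NUM nodes
```

Summing such histograms over all sentences would give the leaf count at degree 0 and the maximum degree reported above.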
Children of NUM nodes are attached using 16 different relations: flat (106; 53% instances), nummod (16; 8% instances), advmod (15; 7% instances), nmod (15; 7% instances), det (11; 5% instances), conj (10; 5% instances), punct (6; 3% instances), advcl (4; 2% instances), nsubj (4; 2% instances), cc (3; 1% instances), cop (3; 1% instances), obl (3; 1% instances), amod (2; 1% instances), case (1; 0% instances), discourse (1; 0% instances), parataxis (1; 0% instances)
Children of NUM nodes belong to 11 different parts of speech: NUM (139; 69% instances), ADV (15; 7% instances), NOUN (12; 6% instances), DET (11; 5% instances), ADJ (7; 3% instances), PUNCT (6; 3% instances), AUX (3; 1% instances), CCONJ (3; 1% instances), PART (2; 1% instances), PRON (2; 1% instances), VERB (1; 0% instances)