Treebank Statistics: UD_Chinese-GSD: POS Tags: NUM
There are 1255 NUM lemmas (6%), 1255 NUM types (6%) and 6660 NUM tokens (5%).
Out of 16 observed tags, the rank of NUM is: 4 in number of lemmas, 4 in number of types and 6 in number of tokens.
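Counts and ranks of this kind can be recomputed from a CoNLL-U file. The following is a minimal sketch, assuming that "types" means distinct word forms and that ties share a rank; the page does not state either convention. The `SAMPLE` fragment and the function names `pos_statistics` and `rank` are hypothetical and not taken from the treebank or its tooling.

```python
from collections import defaultdict

# Toy CoNLL-U fragment (hypothetical data). Columns are written with single
# spaces for readability and converted to the tabs that CoNLL-U requires.
SAMPLE = """\
1 我 我 PRON _ _ 2 nsubj _ _
2 買 買 VERB _ _ 0 root _ _
3 三 三 NUM _ NumType=Card 5 nummod _ _
4 本 本 NOUN _ _ 3 clf _ _
5 書 書 NOUN _ _ 2 obj _ _

1 一 一 NUM _ NumType=Card 3 nummod _ _
2 個 個 NOUN _ _ 1 clf _ _
3 人 人 NOUN _ _ 0 root _ _
""".replace(" ", "\t")

def pos_statistics(conllu_text):
    """Return {upos: (distinct lemmas, distinct forms, tokens)}."""
    lemmas, forms, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip blank lines, comments, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        forms[upos].add(form)
        tokens[upos] += 1
    return {t: (len(lemmas[t]), len(forms[t]), tokens[t]) for t in tokens}

def rank(stats, upos, column):
    """1-based rank of `upos` by column (0 = lemmas, 1 = types, 2 = tokens)."""
    return 1 + sum(1 for v in stats.values() if v[column] > stats[upos][column])

stats = pos_statistics(SAMPLE)
print(stats["NUM"], rank(stats, "NUM", 2))  # (2, 2, 2) 2
```

In the toy fragment NUM has two lemmas, two types and two tokens, and only NOUN has more tokens, so NUM ranks second by token count.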
The 10 most frequent NUM lemmas: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent NUM types: 一、 兩、 三、 1、 第一、 3、 12、 5、 2、 8
The 10 most frequent ambiguous lemmas: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 眾多 (ADJ 8, NUM 8)
The 10 most frequent ambiguous types: 一 (NUM 1124, NOUN 1), 第一 (NUM 117, ADJ 1, PROPN 1), 多 (NUM 83, ADV 28, ADJ 16, PART 3), 雙 (NUM 35, NOUN 1), 很多 (NUM 33, ADJ 4), 單 (NUM 26, PART 2), 半 (NUM 24, PART 6), 數 (NUM 22, PART 15), 九 (NUM 16, PROPN 2), 眾多 (ADJ 8, NUM 8)
Morphology
The form / lemma ratio of NUM is 1.000000 (the average of all parts of speech is 1.004819).
The 1st highest number of forms (1) was observed with the lemma “-15”: -15.
The 2nd highest number of forms (1) was observed with the lemma “-154”: -154.
The 3rd highest number of forms (1) was observed with the lemma “-300”: -300.
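The ratio and the per-lemma form counts can be illustrated with a short sketch. It assumes the ratio is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas, which is consistent with 1255 types over 1255 lemmas but is not stated on the page. The triples in `TOKENS`, the function names, and the alphabetical tie-breaking are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (form, lemma, UPOS) triples, not taken from the treebank.
TOKENS = [
    ("三", "三", "NUM"), ("一", "一", "NUM"), ("三", "三", "NUM"),
    ("他", "他", "PRON"), ("他們", "他", "PRON"),
]

def forms_by_lemma(tokens, upos):
    """Map each lemma of the given tag to the set of its observed forms."""
    table = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            table[lemma].add(form)
    return table

def form_lemma_ratio(tokens, upos):
    """Distinct (lemma, form) pairs per distinct lemma.

    Assumes the tag occurs at least once in `tokens`.
    """
    table = forms_by_lemma(tokens, upos)
    return sum(len(forms) for forms in table.values()) / len(table)

def richest_lemmas(tokens, upos, n=3):
    """Lemmas with the most forms; ties are broken by lemma string (assumption)."""
    table = forms_by_lemma(tokens, upos)
    return sorted(table, key=lambda lemma: (-len(table[lemma]), lemma))[:n]

print(form_lemma_ratio(TOKENS, "NUM"))   # 1.0
print(form_lemma_ratio(TOKENS, "PRON"))  # 2.0
```

A ratio of exactly 1.0, as reported for NUM, means that every lemma was seen with a single form, which is why all three "highest" lemmas listed above have just one form each.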
NUM occurs with 1 feature: NumType (6659; 100% instances)
NUM occurs with 2 feature-value pairs: NumType=Card, NumType=Ord
NUM occurs with 3 feature combinations.
The most frequent feature combination is NumType=Card (6258 tokens).
Examples: 一、 兩、 三、 1、 3、 12、 5、 2、 8、 10
Relations
NUM nodes are attached to their parents using 17 different relations: nummod (6237; 94% instances), obj (61; 1% instances), conj (58; 1% instances), obl (58; 1% instances), root (53; 1% instances), nmod (51; 1% instances), parataxis (44; 1% instances), nsubj (30; 0% instances), acl (12; 0% instances), appos (12; 0% instances), nmod:tmod (12; 0% instances), compound (10; 0% instances), advcl (6; 0% instances), amod (6; 0% instances), ccomp (5; 0% instances), xcomp (4; 0% instances), nsubj:pass (1; 0% instances)
Parents of NUM nodes belong to 10 different parts of speech: NOUN (6201; 93% instances), VERB (171; 3% instances), PART (96; 1% instances), NUM (72; 1% instances), no parent, i.e. the sentence root (53; 1% instances), PROPN (25; 0% instances), X (22; 0% instances), ADJ (16; 0% instances), DET (3; 0% instances), SYM (1; 0% instances)
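The two distributions above (incoming relation and part of speech of the parent) can be tallied in one pass over a CoNLL-U file. This is a minimal sketch under stated assumptions: the `SAMPLE` fragment is hypothetical, the names `sentences` and `attachment_counts` are illustrative, tokens whose HEAD is 0 are counted under a made-up `ROOT` label, and every HEAD value is assumed to be an integer.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical data); single spaces become tabs.
SAMPLE = """\
1 我 我 PRON _ _ 2 nsubj _ _
2 買 買 VERB _ _ 0 root _ _
3 三 三 NUM _ NumType=Card 5 nummod _ _
4 本 本 NOUN _ _ 3 clf _ _
5 書 書 NOUN _ _ 2 obj _ _

1 一 一 NUM _ NumType=Card 3 nummod _ _
2 個 個 NOUN _ _ 1 clf _ _
3 人 人 NOUN _ _ 0 root _ _

1 三 三 NUM _ NumType=Card 0 root _ _
""".replace(" ", "\t")

def sentences(conllu_text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    current = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            current.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
        elif not line.strip() and current:
            yield current
            current = []
    if current:
        yield current

def attachment_counts(conllu_text, upos="NUM"):
    """Count incoming relations and parent tags for nodes with the given tag."""
    relations, parent_tags = Counter(), Counter()
    for sent in sentences(conllu_text):
        tag_of = {tok_id: tag for tok_id, tag, _, _ in sent}
        for _, tag, head, deprel in sent:
            if tag == upos:
                relations[deprel] += 1
                # HEAD 0 has no token, so it falls back to the ROOT label.
                parent_tags[tag_of.get(head, "ROOT")] += 1
    return relations, parent_tags

relations, parent_tags = attachment_counts(SAMPLE)
print(relations.most_common())
print(parent_tags.most_common())
```

In the toy data two numerals attach to nouns as `nummod` and one is the root of its sentence, mirroring the dominant NOUN/nummod pattern reported above.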
4241 (64%) NUM nodes are leaves.
2239 (34%) NUM nodes have one child.
62 (1%) NUM nodes have two children.
118 (2%) NUM nodes have three or more children.
The highest child degree of a NUM node is 7.
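The child-degree figures above amount to counting, for each NUM node, how many tokens name it as their head. A minimal sketch follows; the `SAMPLE` fragment and the function name `child_degrees` are hypothetical, and HEAD values are assumed to be integers.

```python
from collections import Counter

# Toy CoNLL-U fragment (hypothetical data); single spaces become tabs.
SAMPLE = """\
1 我 我 PRON _ _ 2 nsubj _ _
2 買 買 VERB _ _ 0 root _ _
3 三 三 NUM _ NumType=Card 5 nummod _ _
4 本 本 NOUN _ _ 3 clf _ _
5 書 書 NOUN _ _ 2 obj _ _

1 一 一 NUM _ NumType=Card 3 nummod _ _
2 個 個 NOUN _ _ 1 clf _ _
3 人 人 NOUN _ _ 0 root _ _

1 三 三 NUM _ NumType=Card 0 root _ _
""".replace(" ", "\t")

def child_degrees(conllu_text, upos="NUM"):
    """Return a Counter mapping child count -> number of nodes with that tag."""
    distribution = Counter()
    sentence = []  # (id, upos, head) tuples of the sentence being read

    def flush():
        # Number of dependents per head id within the current sentence.
        n_children = Counter(head for _, _, head in sentence)
        for tok_id, tag, _ in sentence:
            if tag == upos:
                distribution[n_children[tok_id]] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append((int(cols[0]), cols[3], int(cols[6])))
        elif not line.strip():
            flush()
    flush()  # last sentence when the text does not end with a blank line
    return distribution

dist = child_degrees(SAMPLE)
print(dist[0], "leaves; highest child degree:", max(dist))
```

In the toy data the two numerals that take a classifier have one child each (the `clf` dependent) and the bare numeral is a leaf.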
Children of NUM nodes are attached using 25 different relations: clf (2033; 72% instances), punct (183; 6% instances), nmod (143; 5% instances), nsubj (97; 3% instances), cop (93; 3% instances), case (59; 2% instances), conj (57; 2% instances), cc (44; 2% instances), advmod (34; 1% instances), acl (22; 1% instances), det (15; 1% instances), parataxis (12; 0% instances), nummod (10; 0% instances), appos (8; 0% instances), nmod:tmod (7; 0% instances), obl (5; 0% instances), csubj (4; 0% instances), flat:foreign (4; 0% instances), mark (2; 0% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), amod (1; 0% instances), ccomp (1; 0% instances), obj (1; 0% instances), xcomp (1; 0% instances)
Children of NUM nodes belong to 16 different parts of speech: NOUN (2234; 79% instances), PUNCT (183; 6% instances), AUX (93; 3% instances), PART (77; 3% instances), NUM (72; 3% instances), CCONJ (44; 2% instances), ADV (34; 1% instances), VERB (24; 1% instances), ADP (17; 1% instances), DET (15; 1% instances), PROPN (13; 0% instances), PRON (12; 0% instances), X (10; 0% instances), SYM (5; 0% instances), ADJ (3; 0% instances), SCONJ (2; 0% instances)