Treebank Statistics: UD_Chinese-PUD: POS Tags: X
There are 275 X lemmas (5%), 275 X types (5%) and 306 X tokens (1%).
Out of 15 observed tags, the rank of X is: 5 in number of lemmas, 5 in number of types and 13 in number of tokens.
The 10 most frequent X lemmas: BBC、 CNN、 the、 Martin、 Anaya、 Andy、 B.C.、 Barrosos、 Catalano、 DNA
The 10 most frequent X types: BBC、 CNN、 the、 Martin、 Anaya、 Andy、 B.C.、 Barrosos、 Catalano、 DNA
The 10 most frequent ambiguous lemmas: 中 (ADP 113, NOUN 6, X 1), 的 (PART 1361, X 1), 而 (ADV 47, CCONJ 1, X 1), 被 (AUX 79, ADP 22, X 1)
The 10 most frequent ambiguous types: 中 (ADP 113, NOUN 6, X 1), 的 (PART 1361, X 1), 而 (ADV 47, CCONJ 1, X 1), 被 (AUX 79, ADP 22, X 1)
Morphology
The form / lemma ratio of X is 1.000000 (the average of all parts of speech is 1.006233).
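The ratio above can be reproduced with a short sketch. It assumes the ratio is the number of distinct word forms divided by the number of distinct lemmas for the tag, which agrees with the 275 types and 275 lemmas reported for X. The function name `form_lemma_ratio` and the sample triples are illustrative only and are not taken from the treebank tooling.

```python
def form_lemma_ratio(tokens, upos):
    """Distinct forms divided by distinct lemmas for one UPOS tag.

    tokens: iterable of (form, lemma, upos) triples (hypothetical input shape).
    """
    forms, lemmas = set(), set()
    for form, lemma, tag in tokens:
        if tag == upos:
            forms.add(form)
            lemmas.add(lemma)
    # Avoid division by zero when the tag never occurs.
    return len(forms) / len(lemmas) if lemmas else 0.0


# Toy data: every X lemma has exactly one form, as in this treebank.
sample = [
    ("BBC", "BBC", "X"),
    ("CNN", "CNN", "X"),
    ("BBC", "BBC", "X"),
    ("的", "的", "PART"),
]
ratio = form_lemma_ratio(sample, "X")

# Same arithmetic with the counts reported for X.
reported_ratio = 275 / 275
```

A ratio of exactly 1 means that no X lemma appears in more than one surface form, which is consistent with the three "highest number of forms" entries all being 1.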
The 1st highest number of forms (1) was observed with the lemma “Addenbrooke”: Addenbrooke.
The 2nd highest number of forms (1) was observed with the lemma “Adnan”: Adnan.
The 3rd highest number of forms (1) was observed with the lemma “Agora”: Agora.
X occurs with 1 feature: Foreign (91; 30% instances)
X occurs with 1 feature-value pair: Foreign=Yes
X occurs with 2 feature combinations.
The most frequent feature combination is _ (215 tokens).
Examples: BBC、 CNN、 Martin、 Andy、 B.C.、 Barrosos、 DNA、 Dündar、 Facebook、 Leive
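As a quick consistency check on the feature figures above, the following sketch recomputes the share of Foreign=Yes from the reported counts. The only inputs are numbers stated on this page; the variable names are illustrative.

```python
from collections import Counter

# Counts taken from the statistics above.
x_tokens = 306
combos = Counter({"_": 215, "Foreign=Yes": 91})

# The two feature combinations should account for every X token.
total = sum(combos.values())

# Share of X tokens carrying Foreign=Yes, rounded to a whole percent.
foreign_share = round(100 * combos["Foreign=Yes"] / x_tokens)

# The most frequent combination and its token count.
top_combo, top_count = combos.most_common(1)[0]
```

The two combinations add up to the 306 X tokens, so the empty combination `_` covers the remaining 70% of instances.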
Relations
X nodes are attached to their parents using 13 different relations: appos (104; 34% instances), flat (91; 30% instances), nsubj (32; 10% instances), compound (25; 8% instances), nmod (12; 4% instances), obj (11; 4% instances), obl (10; 3% instances), conj (8; 3% instances), dep (8; 3% instances), nsubj:pass (2; 1% instances), acl:relcl (1; 0% instances), discourse (1; 0% instances), root (1; 0% instances)
Parents of X nodes belong to 9 different parts of speech: NOUN (90; 29% instances), X (87; 28% instances), VERB (66; 22% instances), PROPN (55; 18% instances), ADJ (2; 1% instances), NUM (2; 1% instances), PART (2; 1% instances), PRON (1; 0% instances), empty tag (1; 0% instances)
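Counts like the relation and parent-tag figures above can be gathered from a CoNLL-U file with a few lines of Python. This is a minimal sketch, not the script that produced this page: the function name `relation_stats` is hypothetical, and the four-token sample is invented for illustration rather than taken from the treebank. It relies only on the standard CoNLL-U column order (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
from collections import Counter

# Invented toy sentence in CoNLL-U format (not from UD_Chinese-PUD).
SAMPLE = (
    "1\tBBC\tBBC\tX\t_\t_\t2\tnsubj\t_\t_\n"
    "2\t報導\t報導\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tMartin\tMartin\tX\t_\tForeign=Yes\t2\tobj\t_\t_\n"
    "4\tAnaya\tAnaya\tX\t_\tForeign=Yes\t3\tflat\t_\t_\n"
)


def relation_stats(conllu, upos="X"):
    """Count DEPREL values and parent UPOS tags for nodes tagged `upos`."""
    rel_counts, parent_pos = Counter(), Counter()
    for sentence in conllu.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in sentence.splitlines()
            if line and not line.startswith("#")  # skip comment lines
        ]
        # Keep plain word lines; drop multiword ranges (1-2) and empty nodes (1.1).
        rows = [r for r in rows if r[0].isdigit()]
        pos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] == upos:
                rel_counts[r[7]] += 1
                # HEAD 0 is the artificial root, which has no tag of its own.
                parent_pos[pos_by_id.get(r[6], "")] += 1
    return rel_counts, parent_pos


rels, parents = relation_stats(SAMPLE)
```

In this sketch a node whose head is 0 is counted under an empty parent tag, which would be one plausible source of a tagless parent entry next to a single root relation.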
152 (50%) X nodes are leaves.
41 (13%) X nodes have one child.
63 (21%) X nodes have two children.
50 (16%) X nodes have three or more children.
The highest child degree of an X node is 7.
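The child-degree breakdown above can be sketched as follows. The helper `degree_buckets` and the five-node toy sentence are assumptions made for illustration; only the four reported counts come from this page.

```python
from collections import Counter


def degree_buckets(nodes, upos="X"):
    """Bucket nodes tagged `upos` by their number of children.

    nodes: list of (id, upos, head) tuples for one sentence (hypothetical shape).
    """
    # Each occurrence of an id in the head column is one child of that node.
    children = Counter(head for _, _, head in nodes)
    buckets = Counter()
    for node_id, tag, _ in nodes:
        if tag != upos:
            continue
        n = children[node_id]
        if n == 0:
            buckets["leaf"] += 1
        elif n == 1:
            buckets["one"] += 1
        elif n == 2:
            buckets["two"] += 1
        else:
            buckets["three_or_more"] += 1
    return buckets


# Invented toy sentence: node 3 has two children, nodes 1 and 4 have none.
toy = [(1, "X", 2), (2, "VERB", 0), (3, "X", 2), (4, "X", 3), (5, "PUNCT", 3)]
toy_buckets = degree_buckets(toy)

# Reported counts for X in this treebank; they should cover all 306 tokens.
reported = {"leaf": 152, "one": 41, "two": 63, "three_or_more": 50}
reported_total = sum(reported.values())
```

The four reported buckets sum to 306, so every X token falls into exactly one of them.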
Children of X nodes are attached using 15 different relations: punct (196; 56% instances), flat (82; 24% instances), case (22; 6% instances), conj (9; 3% instances), cc (7; 2% instances), acl:relcl (6; 2% instances), nmod (6; 2% instances), case:loc (4; 1% instances), compound (4; 1% instances), appos (3; 1% instances), nummod (3; 1% instances), cop (2; 1% instances), nsubj (2; 1% instances), amod (1; 0% instances), mark:rel (1; 0% instances)
Children of X nodes belong to 12 different parts of speech: PUNCT (196; 56% instances), X (87; 25% instances), PART (15; 4% instances), ADP (13; 4% instances), NOUN (9; 3% instances), CCONJ (7; 2% instances), PROPN (7; 2% instances), VERB (7; 2% instances), NUM (3; 1% instances), AUX (2; 1% instances), ADJ (1; 0% instances), PRON (1; 0% instances)