
This page pertains to UD version 2.

Treebank Statistics: UD_Chinese-GSD: POS Tags: NOUN

There are 8161 NOUN lemmas (36%), 8162 NOUN types (36%) and 34046 NOUN tokens (28%). Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: 年、 個、 月、 人、 日、 等、 種、 次、 人口、 名

The 10 most frequent NOUN types: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

The 10 most frequent ambiguous lemmas: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 人 (NOUN 385, PART 240, VERB 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 等 (NOUN 231, VERB 4, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

The 10 most frequent ambiguous types: 年 (NOUN 1558, PART 6), 月 (NOUN 604, PART 1), 日 (NOUN 382, PROPN 53, PART 7, NUM 2), 人 (NOUN 365, PART 240, VERB 1), 等 (NOUN 231, VERB 3, PART 1), 種 (NOUN 187, PART 5, VERB 1), 次 (NOUN 149, VERB 4, PART 3, NUM 1), 名 (NOUN 128, PART 6, VERB 3), 大學 (NOUN 120, PROPN 1), 世界 (NOUN 107, PROPN 1)

Morphology

The form / lemma ratio of NOUN is 1.000123 (the average of all parts of speech is 1.004819).
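The reported ratio appears consistent with dividing the number of distinct NOUN forms (types) by the number of distinct NOUN lemmas. The sketch below checks that arithmetic under that assumption; the variable names are illustrative and not taken from the tool that generated this page.

```python
# Counts copied from the summary at the top of this page.
noun_lemmas = 8161
noun_types = 8162

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
ratio = noun_types / noun_lemmas
formatted = f"{ratio:.6f}"
print(formatted)  # 1.000123
```

A ratio this close to 1 means almost every NOUN lemma has a single surface form, which matches the observation that only 人 has two forms.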

The 1st highest number of forms (2) was observed with the lemma “人”: 人, 人們.

The 2nd highest number of forms (1) was observed with the lemma “m”: m.

The 3rd highest number of forms (1) was observed with the lemma “n=1”: n=1.

NOUN occurs with 1 feature: Number (20; 0% instances)

NOUN occurs with 1 feature-value pair: Number=Plur

NOUN occurs with 2 feature combinations. The most frequent feature combination is _ (no features; 34026 tokens). Examples: 年、 個、 月、 日、 人、 等、 種、 次、 人口、 名

Relations

NOUN nodes are attached to their parents using 25 different relations: nmod (9824; 29% instances), obj (5676; 17% instances), nsubj (5571; 16% instances), obl (2581; 8% instances), clf (2247; 7% instances), compound (1955; 6% instances), conj (1659; 5% instances), nmod:tmod (1554; 5% instances), acl (570; 2% instances), root (570; 2% instances), appos (515; 2% instances), parataxis (399; 1% instances), advcl (226; 1% instances), ccomp (193; 1% instances), nsubj:pass (157; 0% instances), obl:patient (141; 0% instances), xcomp (79; 0% instances), iobj (49; 0% instances), csubj (35; 0% instances), acl:relcl (15; 0% instances), amod (10; 0% instances), dislocated (10; 0% instances), nummod (6; 0% instances), case (2; 0% instances), orphan (2; 0% instances)
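As a consistency check, the relation counts above can be summed and converted back to percentages. This is a minimal sketch that assumes the percentages are rounded to the nearest integer, which seems to match the listed values; it is not the script that produced this page.

```python
# Counts copied from the list of relations attaching NOUN nodes to their parents.
relation_counts = {
    "nmod": 9824, "obj": 5676, "nsubj": 5571, "obl": 2581, "clf": 2247,
    "compound": 1955, "conj": 1659, "nmod:tmod": 1554, "acl": 570,
    "root": 570, "appos": 515, "parataxis": 399, "advcl": 226,
    "ccomp": 193, "nsubj:pass": 157, "obl:patient": 141, "xcomp": 79,
    "iobj": 49, "csubj": 35, "acl:relcl": 15, "amod": 10,
    "dislocated": 10, "nummod": 6, "case": 2, "orphan": 2,
}

# Every NOUN token has exactly one incoming relation, so the sum
# should equal the number of NOUN tokens.
total = sum(relation_counts.values())

# Assumption: shares are rounded to whole percent.
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)                                     # 34046
print(shares["nmod"], shares["nsubj:pass"])      # 29 0
```

The total matches the 34046 NOUN tokens reported at the top of the page, and relations shown as 0% are simply below half a percent.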

Parents of NOUN nodes belong to 14 different parts of speech: VERB (14743; 43% instances), NOUN (11414; 34% instances), PART (3699; 11% instances), NUM (2234; 7% instances), ADJ (765; 2% instances), [root, no tag] (570; 2% instances), PROPN (471; 1% instances), X (65; 0% instances), ADP (36; 0% instances), PRON (18; 0% instances), ADV (14; 0% instances), DET (13; 0% instances), SYM (3; 0% instances), AUX (1; 0% instances)

14630 (43%) NOUN nodes are leaves.

8346 (25%) NOUN nodes have one child.

5418 (16%) NOUN nodes have two children.

5652 (17%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 14.
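The child-degree figures above could be recomputed from a CoNLL-U file roughly as follows. This is a hedged sketch: the function name `noun_child_degrees` and the four-token sample sentence are invented for illustration and are not taken from the treebank or from the tool behind this page.

```python
from collections import Counter

def noun_child_degrees(conllu_text):
    """Map child count -> number of NOUN nodes with that many children."""
    degrees = Counter()
    for block in conllu_text.strip().split("\n\n"):
        upos = {}             # token id -> UPOS tag
        children = Counter()  # head id -> number of dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue
            cols = line.split("\t")
            # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            if not cols[0].isdigit():
                continue
            upos[cols[0]] = cols[3]   # column 4: UPOS
            children[cols[6]] += 1    # column 7: HEAD
        for tok_id, tag in upos.items():
            if tag == "NOUN":
                degrees[children[tok_id]] += 1
    return degrees

def make_line(tok_id, form, upos, head, deprel):
    # Build a 10-column CoNLL-U line; unused columns are left as "_".
    return "\t".join([tok_id, form, form, upos, "_", "_", head, deprel, "_", "_"])

# Hypothetical sample sentence, constructed only for this example.
sample = "\n".join([
    "# text = 三個人來",
    make_line("1", "三", "NUM", "3", "nummod"),
    make_line("2", "個", "NOUN", "1", "clf"),
    make_line("3", "人", "NOUN", "4", "nsubj"),
    make_line("4", "來", "VERB", "0", "root"),
])

degrees = noun_child_degrees(sample)
print(degrees[0], degrees[1])  # 1 1
```

In the sample, 個 is a leaf and 人 has one child (三), so one NOUN node falls into each of the first two buckets; taking `max(degrees)` over the real treebank would give the highest child degree.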

Children of NOUN nodes are attached using 32 different relations: nmod (12877; 31% instances), nummod (6104; 15% instances), case (5910; 14% instances), punct (4630; 11% instances), amod (1791; 4% instances), conj (1638; 4% instances), acl:relcl (1509; 4% instances), det (1369; 3% instances), cop (1153; 3% instances), nsubj (1099; 3% instances), cc (984; 2% instances), appos (863; 2% instances), acl (497; 1% instances), parataxis (357; 1% instances), clf (211; 1% instances), advmod (206; 0% instances), mark (81; 0% instances), advcl (70; 0% instances), obl (68; 0% instances), csubj (61; 0% instances), nmod:tmod (52; 0% instances), dislocated (36; 0% instances), compound (33; 0% instances), ccomp (20; 0% instances), xcomp (16; 0% instances), mark:rel (15; 0% instances), obj (12; 0% instances), aux (10; 0% instances), discourse (8; 0% instances), orphan (2; 0% instances), mark:adv (1; 0% instances), obl:patient (1; 0% instances)

Children of NOUN nodes belong to 16 different parts of speech: NOUN (11414; 27% instances), NUM (6201; 15% instances), PUNCT (4630; 11% instances), PART (4372; 10% instances), PROPN (3451; 8% instances), ADP (3365; 8% instances), VERB (2094; 5% instances), ADJ (1725; 4% instances), AUX (1164; 3% instances), DET (1144; 3% instances), CCONJ (981; 2% instances), PRON (590; 1% instances), X (244; 1% instances), ADV (205; 0% instances), SCONJ (85; 0% instances), SYM (19; 0% instances)