
This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: NOUN

There are 525 NOUN lemmas (31%), 525 NOUN types (31%) and 2085 NOUN tokens (15%). Out of 15 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 3 in number of tokens.
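As a rough illustration of how counts and ranks of this kind could be derived, the sketch below tallies distinct lemmas, distinct types and tokens per tag. The rows in `TOKENS` are invented toy data, not treebank content, and the function names `tag_statistics` and `rank` are hypothetical; this is not the script that generated this page.

```python
from collections import defaultdict

# Invented toy rows (FORM, LEMMA, UPOS); not taken from the treebank.
TOKENS = [
    ("主席", "主席", "NOUN"),
    ("決定", "決定", "VERB"),
    ("呢", "呢", "DET"),
    ("個", "個", "NOUN"),
    ("問題", "問題", "NOUN"),
    ("個", "個", "NOUN"),
    ("。", "。", "PUNCT"),
]

def tag_statistics(tokens):
    """Return {upos: (n_lemmas, n_types, n_tokens)}."""
    lemmas, types, counts = defaultdict(set), defaultdict(set), defaultdict(int)
    for form, lemma, upos in tokens:
        lemmas[upos].add(lemma)   # distinct lemmas seen with this tag
        types[upos].add(form)     # distinct word forms seen with this tag
        counts[upos] += 1         # running token count
    return {t: (len(lemmas[t]), len(types[t]), counts[t]) for t in counts}

def rank(stats, tag, column):
    """1-based rank of `tag` when tags are sorted by one column, descending."""
    ordered = sorted(stats, key=lambda t: stats[t][column], reverse=True)
    return ordered.index(tag) + 1

stats = tag_statistics(TOKENS)
noun_stats = stats["NOUN"]            # (lemmas, types, tokens)
noun_token_rank = rank(stats, "NOUN", 2)
```

On real data the rows would come from the FORM, LEMMA and UPOS columns of the treebank's CoNLL-U files.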

The 10 most frequent NOUN lemmas: 個、 主席、 議員、 啲、 人、 問題、 會議、 而家、 規則、 嘢

The 10 most frequent NOUN types: 個、 主席、 議員、 啲、 人、 問題、 會議、 而家、 規則、 嘢

The 10 most frequent ambiguous lemmas: 個 (NOUN 205, PART 2), 啲 (NOUN 62, ADV 30, DET 2, PART 1), 而家 (NOUN 38, CCONJ 3), 選舉 (NOUN 24, VERB 4), 決定 (NOUN 23, VERB 10), 宣誓 (VERB 16, NOUN 13), 點 (NOUN 12, ADV 11), 嗰陣時 (NOUN 9, ADP 4, SCONJ 1), 份 (NOUN 8, PART 1), 分 (NOUN 7, VERB 1)

The 10 most frequent ambiguous types: 個 (NOUN 205, PART 2), 啲 (NOUN 62, ADV 30, DET 2, PART 1), 而家 (NOUN 38, CCONJ 3), 選舉 (NOUN 24, VERB 4), 決定 (NOUN 23, VERB 10), 宣誓 (VERB 16, NOUN 13), 點 (NOUN 12, ADV 11), 嗰陣時 (NOUN 9, ADP 4, SCONJ 1), 份 (NOUN 8, PART 1), 分 (NOUN 7, VERB 1)
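The ambiguity lists above can be read as "items that occur with NOUN and with at least one other tag", ordered by their NOUN frequency (the listed order is consistent with that reading, but it is an assumption). A minimal sketch under that assumption, using invented toy pairs and a hypothetical helper `ambiguous_lemmas`:

```python
from collections import Counter, defaultdict

# Invented toy (LEMMA, UPOS) pairs; not treebank data.
PAIRS = [
    ("決定", "NOUN"), ("決定", "VERB"), ("決定", "NOUN"),
    ("問題", "NOUN"),
    ("點", "ADV"), ("點", "NOUN"),
]

def ambiguous_lemmas(pairs, tag="NOUN"):
    """Lemmas seen with `tag` and some other tag, most frequent under `tag` first."""
    by_lemma = defaultdict(Counter)
    for lemma, upos in pairs:
        by_lemma[lemma][upos] += 1
    # Keep only lemmas tagged `tag` at least once and with more than one tag overall.
    ambiguous = {l: c for l, c in by_lemma.items() if tag in c and len(c) > 1}
    return sorted(ambiguous.items(), key=lambda kv: kv[1][tag], reverse=True)

result = ambiguous_lemmas(PAIRS)
for lemma, tag_counts in result:
    print(lemma, dict(tag_counts))
```

Passing (FORM, UPOS) pairs instead of (LEMMA, UPOS) pairs would give the corresponding list of ambiguous types.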

Morphology

The form / lemma ratio of NOUN is 1.000000 (the average of all parts of speech is 1.001746).

The 1st highest number of forms (1) was observed with the lemma “CD”: CD.

The 2nd highest number of forms (1) was observed with the lemma “Declaration_of_Renunciation_of_UK_citizenship”: Declaration_of_Renunciation_of_UK_citizenship.

The 3rd highest number of forms (1) was observed with the lemma “Mean”: Mean.
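The exact definition of the form / lemma ratio is not stated on this page. One plausible reading, assumed here, is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas; with 525 types and 525 lemmas that gives 1.0 for NOUN. The sketch below uses invented toy tokens and hypothetical function names.

```python
from collections import defaultdict

# Invented toy (FORM, LEMMA) rows for a single tag; not treebank data.
TOKENS = [("CD", "CD"), ("CD", "CD"), ("cats", "cat"), ("cat", "cat")]

def forms_per_lemma(tokens):
    """Map each lemma to the set of distinct forms it occurs with."""
    forms = defaultdict(set)
    for form, lemma in tokens:
        forms[lemma].add(form)
    return forms

def form_lemma_ratio(tokens):
    """Assumed definition: distinct (lemma, form) pairs / distinct lemmas."""
    forms = forms_per_lemma(tokens)
    n_pairs = sum(len(fs) for fs in forms.values())
    return n_pairs / len(forms)

ratio = form_lemma_ratio(TOKENS)
# Lemmas ordered by how many distinct forms they have, largest first.
richest = sorted(forms_per_lemma(TOKENS).items(),
                 key=lambda kv: len(kv[1]), reverse=True)
```

A ratio of exactly 1 means every lemma occurs with a single form, which is why each of the "highest" lemmas above has only one form.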

NOUN occurs with 1 feature: NounType (479; 23% instances)

NOUN occurs with 1 feature-value pair: NounType=Clf

NOUN occurs with 2 feature combinations. The most frequent feature combination is _ (1606 tokens). Examples: 主席、 議員、 人、 問題、 會議、 而家、 規則、 嘢、 選舉、 今日

Relations

NOUN nodes are attached to their parents using 30 different relations: obj (648; 31% instances), nsubj (217; 10% instances), clf (212; 10% instances), obl (134; 6% instances), obl:tmod (132; 6% instances), compound (117; 6% instances), clf:det (109; 5% instances), nmod (101; 5% instances), conj (78; 4% instances), root (75; 4% instances), vocative (56; 3% instances), case:loc (47; 2% instances), flat (26; 1% instances), reparandum (22; 1% instances), appos (20; 1% instances), obj:periph (19; 1% instances), ccomp (14; 1% instances), compound:vo (14; 1% instances), dislocated (10; 0% instances), parataxis (9; 0% instances), nsubj:periph (5; 0% instances), xcomp (5; 0% instances), amod (3; 0% instances), advcl (2; 0% instances), case (2; 0% instances), iobj (2; 0% instances), obl:agent (2; 0% instances), obl:patient (2; 0% instances), acl (1; 0% instances), discourse:sp (1; 0% instances)

Parents of NOUN nodes belong to 12 different parts of speech: VERB (1174; 56% instances), NOUN (481; 23% instances), NUM (122; 6% instances), DET (85; 4% instances), [no tag: parent is the artificial root] (75; 4% instances), ADJ (56; 3% instances), PROPN (41; 2% instances), PRON (25; 1% instances), ADP (12; 1% instances), AUX (8; 0% instances), ADV (5; 0% instances), PART (1; 0% instances)
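The two tallies above (incoming relation and parent part of speech) can be sketched as a single pass over a sentence. The sentence below is an invented toy example in a simplified (ID, UPOS, HEAD, DEPREL) layout, and `noun_attachments` is a hypothetical helper, not part of any UD tool.

```python
from collections import Counter

# Invented one-sentence toy: (ID, UPOS, HEAD, DEPREL); HEAD 0 marks the root.
SENT = [
    (1, "NOUN", 2, "vocative"),
    (2, "VERB", 0, "root"),
    (3, "DET", 5, "det"),
    (4, "NOUN", 3, "clf:det"),
    (5, "NOUN", 2, "obj"),
    (6, "PUNCT", 2, "punct"),
]

def noun_attachments(sent, tag="NOUN"):
    """Count incoming relations and parent POS tags for nodes tagged `tag`."""
    upos_of = {i: upos for i, upos, _, _ in sent}
    relations, parent_pos = Counter(), Counter()
    for _, upos, head, deprel in sent:
        if upos == tag:
            relations[deprel] += 1
            # HEAD 0 is the artificial root, which has no POS tag,
            # so it is recorded under an empty label.
            parent_pos[upos_of.get(head, "")] += 1
    return relations, parent_pos

relations, parent_pos = noun_attachments(SENT)
```

The empty label for root-attached nodes explains why the parent count for that entry equals the number of NOUN nodes with the root relation.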

859 (41%) NOUN nodes are leaves.

558 (27%) NOUN nodes have one child.

322 (15%) NOUN nodes have two children.

346 (17%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 14.
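The leaf, one-child, two-children and three-or-more counts above partition the 2085 NOUN tokens (859 + 558 + 322 + 346). A minimal sketch of how such a histogram might be computed, again on an invented toy sentence with a hypothetical helper name:

```python
from collections import Counter

# Invented one-sentence toy: (ID, UPOS, HEAD, DEPREL); HEAD 0 marks the root.
SENT = [
    (1, "NOUN", 2, "vocative"),
    (2, "VERB", 0, "root"),
    (3, "DET", 5, "det"),
    (4, "NOUN", 3, "clf:det"),
    (5, "NOUN", 2, "obj"),
    (6, "PUNCT", 2, "punct"),
]

def child_degree_histogram(sent, tag="NOUN"):
    """Bucket nodes tagged `tag` by number of children (0, 1, 2, 3 = three or more)."""
    n_children = Counter(head for _, _, head, _ in sent)
    degrees = [n_children[i] for i, upos, _, _ in sent if upos == tag]
    buckets = Counter(min(d, 3) for d in degrees)
    return buckets, max(degrees)

buckets, highest = child_degree_histogram(SENT)
```

For a whole treebank the function would be applied per sentence and the bucket counts summed, since token IDs restart in every sentence.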

Children of NOUN nodes are attached using 34 different relations: punct (408; 16% instances), det (322; 12% instances), nmod (222; 9% instances), case (209; 8% instances), nummod (201; 8% instances), discourse:sp (192; 7% instances), acl (162; 6% instances), compound (149; 6% instances), amod (127; 5% instances), clf:det (103; 4% instances), conj (73; 3% instances), advmod (70; 3% instances), case:loc (61; 2% instances), cop (56; 2% instances), nsubj (49; 2% instances), appos (34; 1% instances), discourse (32; 1% instances), reparandum (27; 1% instances), cc (23; 1% instances), obl:tmod (15; 1% instances), obl (13; 1% instances), parataxis (9; 0% instances), vocative (9; 0% instances), advcl (5; 0% instances), dislocated (4; 0% instances), mark (4; 0% instances), mark:rel (4; 0% instances), ccomp (2; 0% instances), flat (2; 0% instances), advcl:coverb (1; 0% instances), aux (1; 0% instances), clf (1; 0% instances), csubj (1; 0% instances), nsubj:periph (1; 0% instances)

Children of NOUN nodes belong to 15 different parts of speech: NOUN (481; 19% instances), PUNCT (408; 16% instances), DET (309; 12% instances), PART (304; 12% instances), NUM (208; 8% instances), PRON (202; 8% instances), VERB (188; 7% instances), ADJ (130; 5% instances), ADP (123; 5% instances), ADV (84; 3% instances), AUX (60; 2% instances), PROPN (38; 1% instances), INTJ (28; 1% instances), CCONJ (23; 1% instances), SCONJ (6; 0% instances)