
This page pertains to UD version 2.

Treebank Statistics: UD_Cantonese-HK: POS Tags: DET

There are 42 DET lemmas (2%), 42 DET types (2%) and 331 DET tokens (2%). Out of 15 observed tags, DET ranks 8th by number of lemmas, 8th by number of types and 9th by number of tokens.
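As a rough illustration of how counts like these could be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag from CoNLL-U text. The function name `upos_counts` and the three-token sample are hypothetical, and the script that generated this page may count differently (for example in how it treats case or multiword tokens).

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank sentence separators and comment lines
        cols = line.split("\t")
        # Keep only regular word lines: skip multiword-token ranges
        # such as "1-2" and empty nodes such as "1.1".
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:  # column 4 is UPOS
            tokens += 1
            types.add(cols[1])   # column 2 is FORM
            lemmas.add(cols[2])  # column 3 is LEMMA
    return len(lemmas), len(types), tokens


# Hypothetical sample sentence, not taken from the treebank.
SAMPLE = "\n".join([
    "# text = hypothetical example",
    "\t".join(["1", "呢", "呢", "DET", "_", "_", "3", "det", "_", "_"]),
    "\t".join(["2", "本", "本", "NOUN", "_", "_", "1", "clf", "_", "_"]),
    "\t".join(["3", "書", "書", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "",
])

det_counts = upos_counts(SAMPLE, "DET")
noun_counts = upos_counts(SAMPLE, "NOUN")
```

Here `det_counts` holds the lemma, type and token counts for DET in the sample, and the same call with another tag gives the figures needed to rank the tags against each other.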

The 10 most frequent DET lemmas: 呢、 嗰、 依個、 任何、 其他、 呢個、 依、 咩、 嗰個、 每

The 10 most frequent DET types: 呢、 嗰、 依個、 任何、 其他、 呢個、 依、 咩、 嗰個、 每

The 10 most frequent ambiguous lemmas: 呢 (PART 328, DET 75, VERB 3, NOUN 1), 依個 (DET 25, PRON 9), 呢個 (PRON 20, DET 16, PART 2), 咩 (PRON 17, DET 15, PART 10), 嗰個 (DET 14, PRON 5), 呢啲 (DET 7, PRON 7), 下 (ADV 27, DET 6, INTJ 2, PART 2), 幾多 (DET 5, PRON 1), 依啲 (DET 3, PRON 1), 嗰啲 (PRON 7, DET 3)

The 10 most frequent ambiguous types: 呢 (PART 328, DET 75, VERB 3, NOUN 1), 依個 (DET 25, PRON 9), 呢個 (PRON 20, DET 16, PART 2), 咩 (PRON 17, DET 15, PART 10), 嗰個 (DET 14, PRON 5), 呢啲 (DET 7, PRON 7), 下 (ADV 27, DET 6, INTJ 2, PART 2), 幾多 (DET 5, PRON 1), 依啲 (DET 3, PRON 1), 嗰啲 (PRON 7, DET 3)
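The ambiguity lists above can be read as: for each lemma (or type) that occurs as DET at least once and also under some other tag, show how often it occurs under each tag. A minimal sketch of that grouping, assuming the input is a plain list of (item, tag) pairs; the function name `ambiguous_items` and the sample pairs are hypothetical.

```python
from collections import Counter, defaultdict


def ambiguous_items(pairs, upos):
    """Map each item tagged `upos` and at least one other tag
    to its per-tag counts, most frequent tag first."""
    tags_by_item = defaultdict(Counter)
    for item, tag in pairs:
        tags_by_item[item][tag] += 1
    return {
        item: counts.most_common()
        for item, counts in tags_by_item.items()
        if upos in counts and len(counts) > 1
    }


# Hypothetical observations, not the treebank's real counts.
pairs = [("呢", "PART"), ("呢", "PART"), ("呢", "DET"), ("嗰", "DET")]
ambiguous = ambiguous_items(pairs, "DET")
```

In this sample only 呢 is reported, because 嗰 occurs under a single tag.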

Morphology

The form / lemma ratio of DET is 1.000000 (the average of all parts of speech is 1.001746).
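One plausible reading of the form / lemma ratio, assumed here rather than confirmed by the page, is the number of distinct (lemma, form) pairs divided by the number of distinct lemmas. The sketch below computes it under that assumption; `form_lemma_ratio` is a hypothetical helper name.

```python
from collections import defaultdict


def form_lemma_ratio(pairs):
    """pairs: iterable of (form, lemma) for the tokens of one tag."""
    forms_by_lemma = defaultdict(set)
    for form, lemma in pairs:
        forms_by_lemma[lemma].add(form)
    if not forms_by_lemma:
        raise ValueError("no tokens given")
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)


# Every lemma has a single form, as reported for DET on this page.
uninflected = form_lemma_ratio([("呢", "呢"), ("嗰", "嗰"), ("呢", "呢")])

# Hypothetical contrast: one lemma with two distinct forms.
inflected = form_lemma_ratio([("a", "a"), ("A", "a"), ("b", "b")])
```

Under this reading, 42 lemmas with one form each give 42 / 42 = 1.0, which matches the reported value for DET.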

No DET lemma has more than one form, so the ranking by number of forms is uninformative. The first three lemmas listed are “一個”, “一切” and “一啲”, each with a single form identical to the lemma.

DET does not occur with any features.

Relations

DET nodes are attached to their parents using 6 different relations (count; share of instances): det (322; 97%), reparandum (4; 1%), advcl (2; 1%), cop (1; 0%), nmod (1; 0%), obj (1; 0%).
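The percentages above appear to be each count divided by the total number of DET nodes (331) and rounded to a whole number, which is why relations with a single instance show as 0%. A minimal check, assuming that rounding scheme:

```python
# Counts copied from the relation list above.
relation_counts = {
    "det": 322, "reparandum": 4, "advcl": 2,
    "cop": 1, "nmod": 1, "obj": 1,
}


def shares(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


total = sum(relation_counts.values())
percentages = shares(relation_counts)
```

The total matches the 331 DET tokens reported at the top of the page, and the same function can be applied to the parent and child lists.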

Parents of DET nodes belong to 7 different parts of speech (count; share of instances): NOUN (309; 93%), VERB (11; 3%), PROPN (6; 2%), ADJ (2; 1%), DET (1; 0%), NUM (1; 0%), PART (1; 0%).

209 (63%) DET nodes are leaves.

114 (34%) DET nodes have one child.

5 (2%) DET nodes have two children.

3 (1%) DET nodes have three or more children.

The highest child degree of a DET node is 5.
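The leaf and child counts above describe the distribution of child degree, that is, how many tokens name a given DET node as their head. A minimal sketch of that computation, assuming each sentence is a list of (id, upos, head) tuples; the function name `child_degree_histogram` and the sample sentences are hypothetical.

```python
from collections import Counter


def child_degree_histogram(sentences, upos):
    """Count nodes tagged `upos` by their number of children."""
    hist = Counter()
    for sent in sentences:
        # How many tokens in this sentence point at each head id.
        n_children = Counter(head for _id, _upos, head in sent)
        for tok_id, tok_upos, _head in sent:
            if tok_upos == upos:
                hist[n_children[tok_id]] += 1
    return hist


# Hypothetical sentences: in the first, the DET heads a classifier;
# in the second, the DET is a leaf.
sentences = [
    [(1, "DET", 3), (2, "NOUN", 1), (3, "NOUN", 0)],
    [(1, "DET", 2), (2, "NOUN", 0)],
]
hist = child_degree_histogram(sentences, "DET")
max_degree = max(hist)
```

In the resulting histogram, key 0 counts the leaves and the largest key is the highest child degree.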

Children of DET nodes are attached using 10 different relations (count; share of instances): clf (84; 62%), case (30; 22%), punct (11; 8%), discourse (3; 2%), discourse:sp (2; 1%), acl (1; 1%), advmod (1; 1%), compound (1; 1%), nsubj (1; 1%), reparandum (1; 1%).

Children of DET nodes belong to 8 different parts of speech (count; share of instances): NOUN (85; 63%), PART (32; 24%), PUNCT (11; 8%), INTJ (3; 2%), ADJ (1; 1%), ADV (1; 1%), DET (1; 1%), VERB (1; 1%).