
This page pertains to UD version 2.

Treebank Statistics: UD_Japanese-GSD: POS Tags: NUM

There are 581 NUM lemmas (3%), 587 NUM types (2%) and 5163 NUM tokens (3%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 8 in number of tokens.

The 10 most frequent NUM lemmas: 1, 2, 3, 4, 一, 5, 10, 6, 8, 7

The 10 most frequent NUM types: 1, 2, 3, 4, 一, 5, 10, 6, 8, 7

The 10 most frequent ambiguous lemmas: 1 (NUM 421, NOUN 11), 2 (NUM 339, NOUN 9), 3 (NUM 325, NOUN 10), 4 (NUM 221, NOUN 4), 一 (NUM 189, PROPN 1), 5 (NUM 160, NOUN 3), 6 (NUM 142, NOUN 2), 8 (NUM 122, NOUN 2), 7 (NUM 119, NOUN 2), 数 (NOUN 60, NUM 34)

The 10 most frequent ambiguous types: 1 (NUM 421, NOUN 11), 2 (NUM 339, NOUN 9), 3 (NUM 325, NOUN 10), 4 (NUM 221, NOUN 4), 一 (NUM 169, PROPN 1), 5 (NUM 160, NOUN 3), 6 (NUM 142, NOUN 2), 8 (NUM 122, NOUN 2), 7 (NUM 119, NOUN 2), 数 (NOUN 60, NUM 34)

Morphology

The form / lemma ratio of NUM is 1.010327 (the average of all parts of speech is 1.115220).
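The ratio above is consistent with dividing the number of distinct NUM types (587) by the number of distinct NUM lemmas (581), both reported at the top of this page. A minimal sketch of that arithmetic, assuming this interpretation; the function name `form_lemma_ratio` is illustrative and not part of any UD tool:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Return the number of distinct forms per distinct lemma."""
    if n_lemmas <= 0:
        raise ValueError("n_lemmas must be positive")
    return n_types / n_lemmas


# Counts reported on this page for NUM in UD_Japanese-GSD.
num_ratio = form_lemma_ratio(587, 581)
print(f"{num_ratio:.6f}")  # prints 1.010327
```

A value close to 1 means that almost every NUM lemma is realized by a single surface form, which matches the short list of multi-form lemmas that follows.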

The 1st highest number of forms (3) was observed with the lemma “一”: ひと, イチ, 一.

The 2nd highest number of forms (2) was observed with the lemma “三”: さん, 三.

The 3rd highest number of forms (2) was observed with the lemma “二”: ふた, 二.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 9 different relations: nummod (2800; 54% instances), compound (2228; 43% instances), nmod (49; 1% instances), obl (27; 1% instances), root (20; 0% instances), nsubj (18; 0% instances), obj (14; 0% instances), nsubj:outer (4; 0% instances), advcl (3; 0% instances)

Parents of NUM nodes belong to 7 different parts of speech: NOUN (4892; 95% instances), VERB (95; 2% instances), NUM (91; 2% instances), ADV (38; 1% instances), PROPN (21; 0% instances), no parent, i.e. root nodes (20; 0% instances), ADJ (6; 0% instances)
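Counts like the relation and parent part-of-speech figures above can be derived from a treebank file in CoNLL-U format. The following is a rough sketch, not the script that generated this page: the helper names `parse_sentences` and `num_attachment_stats` and the two-token `SAMPLE` fragment are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical two-token fragment in CoNLL-U format (10 tab-separated columns).
# It is illustrative only and is not taken from the treebank.
SAMPLE = "\n".join([
    "# text = 3 本",
    "1\t3\t3\tNUM\t_\t_\t2\tnummod\t_\t_",
    "2\t本\t本\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
])


def parse_sentences(text):
    """Yield each sentence as a list of (id, form, lemma, upos, head, deprel)."""
    sent = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sent:
                yield sent
                sent = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        sent.append((int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7]))
    if sent:
        yield sent


def num_attachment_stats(text, upos="NUM"):
    """Count incoming relations and parent UPOS for nodes tagged `upos`."""
    relations, parent_pos = Counter(), Counter()
    for sent in parse_sentences(text):
        pos_by_id = {tok[0]: tok[3] for tok in sent}
        for _id, _form, _lemma, tok_upos, head, deprel in sent:
            if tok_upos != upos:
                continue
            relations[deprel] += 1
            # Head 0 means the node is the root and has no parent word.
            parent_pos[pos_by_id.get(head, "(root)")] += 1
    return relations, parent_pos


relations, parent_pos = num_attachment_stats(SAMPLE)
print(relations.most_common(), parent_pos.most_common())
```

Under this scheme, a root node has no parent word, which is why the parent counts include an entry without a part-of-speech tag whose size (20) equals the number of `root` relations.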

4962 (96%) NUM nodes are leaves.

21 (0%) NUM nodes have one child.

39 (1%) NUM nodes have two children.

141 (3%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
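The child-degree figures above partition the NUM nodes by how many dependents each one has. A minimal sketch of that bucketing, assuming each sentence is available as a list of `(id, upos, head)` tuples; the function name `child_degree_buckets` and the `toy` sentence are hypothetical:

```python
from collections import Counter


def child_degree_buckets(sentence, upos="NUM"):
    """Bucket nodes tagged `upos` by their number of children.

    `sentence` is a list of (id, upos, head) tuples; head 0 marks the root.
    """
    # Each occurrence of an id in the head column is one child of that node.
    n_children = Counter(head for _id, _upos, head in sentence)
    buckets = Counter()
    for tok_id, tok_upos, _head in sentence:
        if tok_upos != upos:
            continue
        degree = n_children[tok_id]
        key = {0: "leaf", 1: "one", 2: "two"}.get(degree, "three_or_more")
        buckets[key] += 1
    return buckets


# Hypothetical sentence: token 2 (NUM) heads tokens 1 and 3; token 3 (NUM) is a leaf.
toy = [(1, "NOUN", 2), (2, "NUM", 0), (3, "NUM", 2)]
print(child_degree_buckets(toy))

# The four buckets reported on this page add up to the NUM token count.
assert 4962 + 21 + 39 + 141 == 5163
```

Because every NUM node falls into exactly one bucket, the four counts should always sum to the total number of NUM tokens, which gives a quick consistency check on the reported figures.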

Children of NUM nodes are attached using 14 different relations: compound (378; 48% instances), case (179; 23% instances), punct (139; 18% instances), nmod (37; 5% instances), nsubj (19; 2% instances), acl (16; 2% instances), aux (12; 2% instances), nummod (4; 1% instances), advmod (3; 0% instances), cc (2; 0% instances), dep (2; 0% instances), det (1; 0% instances), mark (1; 0% instances), nsubj:outer (1; 0% instances)

Children of NUM nodes belong to 13 different parts of speech: NOUN (276; 35% instances), ADP (179; 23% instances), PUNCT (139; 18% instances), NUM (91; 11% instances), SYM (58; 7% instances), PROPN (17; 2% instances), VERB (13; 2% instances), AUX (12; 2% instances), ADV (3; 0% instances), CCONJ (2; 0% instances), PRON (2; 0% instances), DET (1; 0% instances), SCONJ (1; 0% instances)