Treebank Statistics: UD_Japanese-GSD: POS Tags: NOUN
There are 11908 NOUN lemmas (56%), 12402 NOUN types (53%) and 58184 NOUN tokens (30%).
Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
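As an illustration of how lemma, type and token counts per tag can be derived, here is a minimal sketch that reads CoNLL-U text. It assumes that a "type" is a distinct word form as written in the FORM column and that multiword-token ranges and empty nodes are skipped; the sample sentences and the helper names `read_tokens`, `pos_counts` and `rank` are hypothetical and are not taken from the treebank or its tooling.

```python
from collections import defaultdict

# Hypothetical two-sentence CoNLL-U sample (not taken from UD_Japanese-GSD).
SAMPLE = (
    "# text = 年の事\n"
    "1\t年\t年\tNOUN\t_\t_\t3\tnmod\t_\t_\n"
    "2\tの\tの\tADP\t_\t_\t1\tcase\t_\t_\n"
    "3\tこと\t事\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# text = 事\n"
    "1\t事\t事\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def read_tokens(text):
    """Yield (form, lemma, upos) for every regular word line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        yield cols[1], cols[2], cols[3]


def pos_counts(text):
    """Map each UPOS tag to (distinct lemmas, distinct forms, tokens)."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for form, lemma, upos in read_tokens(text):
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {tag: (len(lemmas[tag]), len(types[tag]), tokens[tag]) for tag in tokens}


def rank(counts, upos, index):
    """1-based rank of a tag by lemmas (index 0), types (1) or tokens (2)."""
    ordered = sorted(counts, key=lambda tag: counts[tag][index], reverse=True)
    return ordered.index(upos) + 1


counts = pos_counts(SAMPLE)
print(counts["NOUN"])           # (lemmas, types, tokens) for NOUN in the sample
print(rank(counts, "NOUN", 2))  # rank of NOUN by token count in the sample
```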
The 10 most frequent NOUN lemmas: 年, 事, 月, 日, 御, 人, 者, 後, 為, 物
The 10 most frequent NOUN types: 年, こと, 月, 日, 人, 者, お, 後, ため, もの
The 10 most frequent ambiguous lemmas: 日 (NOUN 514, ADV 1), 後 (NOUN 320, ADV 11), 為 (NOUN 304, SCONJ 63), 物 (NOUN 287, SCONJ 38), 中 (NOUN 248, ADV 1, PROPN 1), 様 (AUX 257, NOUN 172), 所 (NOUN 140, CCONJ 10, SCONJ 10, ADV 1), 前 (NOUN 115, ADV 3), 上 (NOUN 105, SCONJ 16), 共 (NOUN 98, SCONJ 12)
The 10 most frequent ambiguous types: 日 (NOUN 513, ADV 1), 後 (NOUN 308, ADV 1), ため (NOUN 280, SCONJ 61), もの (NOUN 256, SCONJ 38), 中 (NOUN 239, ADV 1, PROPN 1), さん (NOUN 173, NUM 1), よう (AUX 256, NOUN 130), 前 (NOUN 113, ADV 3), 上 (NOUN 102, SCONJ 13), 関係 (NOUN 90, VERB 5)
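The ambiguity lists above are ordered in a way that is consistent with sorting by the NOUN count of each item (for example, 後 with NOUN 320 precedes 為 with NOUN 304, although 為 has more occurrences overall). A minimal sketch of that idea follows, assuming the input is a sequence of (lemma, tag) pairs; the sample pairs and the function name `ambiguous_lemmas` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (lemma, UPOS) pairs, not taken from the treebank.
SAMPLE_PAIRS = [
    ("後", "NOUN"), ("後", "NOUN"), ("後", "ADV"),
    ("為", "NOUN"), ("為", "SCONJ"),
    ("年", "NOUN"),
]


def ambiguous_lemmas(pairs, upos):
    """Return lemmas seen with `upos` and at least one other tag,
    sorted by how often they occur with `upos` (most frequent first)."""
    by_lemma = defaultdict(Counter)
    for lemma, tag in pairs:
        by_lemma[lemma][tag] += 1
    hits = [
        (lemma, tags)
        for lemma, tags in by_lemma.items()
        if upos in tags and len(tags) > 1
    ]
    hits.sort(key=lambda item: item[1][upos], reverse=True)
    return hits


for lemma, tags in ambiguous_lemmas(SAMPLE_PAIRS, "NOUN"):
    # most_common() lists the tags of each lemma by descending count
    listing = ", ".join(f"{tag} {n}" for tag, n in tags.most_common())
    print(f"{lemma} ({listing})")
```

The same function applies to word forms instead of lemmas if the pairs are built from the FORM column.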
Morphology
The form / lemma ratio of NOUN is 1.041485 (the average of all parts of speech is 1.115220).
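The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given in the summary. The short check below assumes that this is how the figure is obtained; the variable names are illustrative only.

```python
noun_types = 12402   # distinct NOUN word forms reported in the summary
noun_lemmas = 11908  # distinct NOUN lemmas reported in the summary

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # prints 1.041485
```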
The three highest-ranked lemmas by number of forms have 5 forms each:
“御”: お, ご, オ, ミ, 御.
“所”: とこ, ところ, どころ, 処, 所.
“真”: まこと, まっ, 真, 真っ, 誠.
NOUN occurs with 1 feature: Polarity (128; 0% instances)
NOUN occurs with 1 feature-value pair: Polarity=Neg
NOUN occurs with 2 feature combinations.
The most frequent feature combination is _ (58056 tokens).
Examples: 年, こと, 月, 日, 人, 者, お, 後, ため, もの
Relations
NOUN nodes are attached to their parents using 14 different relations: compound (19596; 34% instances), obl (11532; 20% instances), nmod (10985; 19% instances), nsubj (6779; 12% instances), obj (4978; 9% instances), root (2330; 4% instances), advcl (866; 1% instances), acl (655; 1% instances), nsubj:outer (400; 1% instances), ccomp (39; 0% instances), case (19; 0% instances), csubj (3; 0% instances), csubj:outer (1; 0% instances), fixed (1; 0% instances)
Parents of NOUN nodes belong to 13 different parts of speech (counting the root, which has no parent, as one entry): NOUN (31329; 54% instances), VERB (21201; 36% instances), no parent, i.e. root (2330; 4% instances), ADJ (1917; 3% instances), PROPN (916; 2% instances), NUM (276; 0% instances), ADV (158; 0% instances), PRON (39; 0% instances), AUX (10; 0% instances), SYM (3; 0% instances), INTJ (2; 0% instances), SCONJ (2; 0% instances), ADP (1; 0% instances)
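The two distributions above (incoming relation and part of speech of the parent) can be tallied in one pass over each sentence. The following is a minimal sketch under the assumption that the input is CoNLL-U text with valid HEAD values; the sample sentences, the `(root)` label for nodes without a parent, and the helper names `sentences` and `attachment_stats` are hypothetical.

```python
from collections import Counter

# Hypothetical CoNLL-U sample (not taken from UD_Japanese-GSD).
SAMPLE = (
    "1\t東京\t東京\tPROPN\t_\t_\t2\tcompound\t_\t_\n"
    "2\t大学\t大学\tNOUN\t_\t_\t4\tobl\t_\t_\n"
    "3\tに\tに\tADP\t_\t_\t2\tcase\t_\t_\n"
    "4\t行く\t行く\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "1\t研究\t研究\tNOUN\t_\t_\t2\tcompound\t_\t_\n"
    "2\t結果\t結果\tNOUN\t_\t_\t0\troot\t_\t_\n"
)


def sentences(text):
    """Yield one dict per sentence: token id -> (upos, head id, deprel)."""
    sent = {}
    for line in text.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[0].isdigit():  # skips multiword ranges and empty nodes
            sent[int(cols[0])] = (cols[3], int(cols[6]), cols[7])
    if sent:
        yield sent


def attachment_stats(text, upos):
    """Count incoming relations and parent tags for nodes tagged `upos`."""
    relations, parent_pos = Counter(), Counter()
    for sent in sentences(text):
        for tag, head, deprel in sent.values():
            if tag != upos:
                continue
            relations[deprel] += 1
            # HEAD 0 marks the sentence root, which has no parent node.
            parent_pos[sent[head][0] if head else "(root)"] += 1
    return relations, parent_pos


relations, parent_pos = attachment_stats(SAMPLE, "NOUN")
print(relations.most_common())
print(parent_pos.most_common())
```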
19502 (34%) NOUN nodes are leaves.
8542 (15%) NOUN nodes have one child.
14754 (25%) NOUN nodes have two children.
15386 (26%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 25.
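The four child-degree counts above add up to the 58184 NOUN tokens, which the sketch below checks before showing one way such buckets could be computed. It assumes each sentence is given as a list of (id, tag, head) triples; the sample trees and the function name `degree_buckets` are hypothetical.

```python
from collections import Counter

# Counts reported above for NOUN nodes with 0, 1, 2 and 3+ children.
REPORTED = {"0": 19502, "1": 8542, "2": 14754, "3+": 15386}
assert sum(REPORTED.values()) == 58184  # equals the NOUN token count


def degree_buckets(sentences, upos):
    """Bucket nodes tagged `upos` by their number of children."""
    buckets = Counter()
    for sentence in sentences:
        # Each head id gets one count per token that points to it.
        n_children = Counter(head for _, _, head in sentence)
        for node_id, tag, _ in sentence:
            if tag == upos:
                n = n_children[node_id]
                buckets["3+" if n >= 3 else str(n)] += 1
    return buckets


# Hypothetical trees as (id, UPOS, head) triples; head 0 is the root.
SAMPLE = [
    [(1, "PROPN", 2), (2, "NOUN", 4), (3, "ADP", 2), (4, "VERB", 0)],
    [(1, "NOUN", 2), (2, "NOUN", 0)],
]
print(degree_buckets(SAMPLE, "NOUN"))
```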
Children of NOUN nodes are attached using 24 different relations: case (34830; 35% instances), compound (24735; 25% instances), nmod (11972; 12% instances), punct (8402; 9% instances), acl (6717; 7% instances), nummod (2789; 3% instances), cop (2366; 2% instances), nsubj (1281; 1% instances), mark (973; 1% instances), det (966; 1% instances), aux (742; 1% instances), obl (709; 1% instances), fixed (457; 0% instances), amod (420; 0% instances), advmod (411; 0% instances), obj (346; 0% instances), cc (207; 0% instances), csubj (120; 0% instances), advcl (106; 0% instances), nsubj:outer (61; 0% instances), dep (41; 0% instances), discourse (6; 0% instances), csubj:outer (4; 0% instances), ccomp (1; 0% instances)
Children of NOUN nodes belong to 16 different parts of speech: ADP (35083; 36% instances), NOUN (31329; 32% instances), PUNCT (8402; 9% instances), VERB (5488; 6% instances), NUM (4892; 5% instances), PROPN (4623; 5% instances), AUX (3200; 3% instances), ADJ (1835; 2% instances), DET (966; 1% instances), SYM (810; 1% instances), PART (525; 1% instances), SCONJ (449; 0% instances), ADV (424; 0% instances), PRON (424; 0% instances), CCONJ (207; 0% instances), INTJ (5; 0% instances)