Treebank Statistics: UD_Dutch-LassySmall: POS Tags: NOUN
There are 10350 NOUN lemmas (38%), 12397 NOUN types (37%) and 49462 NOUN tokens (17%).
Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
The 10 most frequent NOUN lemmas: jaar, land, partij, stad, tijd, oorlog, naam, deel, eeuw, plaats
The 10 most frequent NOUN types: jaar, oorlog, jaren, tijd, eeuw, stad, partij, deel, koning, naam
The 10 most frequent ambiguous lemmas: jaar (NOUN 661, PROPN 1), oorlog (NOUN 272, X 1), album (NOUN 164, X 1), tank (NOUN 150, X 1), leven (NOUN 109, VERB 53), dood (NOUN 80, ADJ 25), nummer (NOUN 80, X 25), rijk (NOUN 76, ADJ 37), weg (NOUN 76, ADV 36), uur (NOUN 67, X 6)
The 10 most frequent ambiguous types: jaar (NOUN 400, PROPN 1), oorlog (NOUN 240, X 1), begin (NOUN 132, VERB 2), landen (NOUN 127, VERB 3), album (NOUN 124, X 1), staat (NOUN 114, VERB 90), leven (NOUN 104, VERB 20), leden (NOUN 76, VERB 5), dood (NOUN 78, ADJ 8), rijk (NOUN 55, ADJ 10)
Morphology
The form / lemma ratio of NOUN is 1.197778 (the average of all parts of speech is 1.223407).
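The reported ratio matches the number of distinct NOUN types divided by the number of distinct NOUN lemmas given at the top of this page. A minimal check, assuming that this is how the ratio is defined (the variable names are illustrative):

```python
# Counts taken from the NOUN summary at the top of this page.
noun_types = 12397
noun_lemmas = 10350

# Assumption: form / lemma ratio = distinct types / distinct lemmas.
form_lemma_ratio = noun_types / noun_lemmas
print(f"{form_lemma_ratio:.6f}")  # prints 1.197778
```

A value close to 1 means that most lemmas are attested in a single form only.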
The 1st highest number of forms (5) was observed with the lemma “been”: been, beenderen, beentje, beentjes, benen.
The 2nd highest number of forms (5) was observed with the lemma “land”: land, lande, landen, landje, lands.
The 3rd highest number of forms (5) was observed with the lemma “stuk”: stuk, stukje, stukjes, stukken, stuks.
NOUN occurs with 2 features: Number (49462; 100% instances), Gender (36680; 74% instances)
NOUN occurs with 5 feature-value pairs: Gender=Com, Gender=Com,Neut, Gender=Neut, Number=Plur, Number=Sing
NOUN occurs with 5 feature combinations.
The most frequent feature combination is Gender=Com|Number=Sing (24994 tokens).
Examples: oorlog, tijd, eeuw, stad, partij, koning, naam, plaats, film, regering
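Feature counts of this kind can be derived from the FEATS column of a CoNLL-U file, where feature-value pairs are separated by "|" and a value such as Com,Neut lists alternatives. The sketch below is illustrative only: parse_feats is a hypothetical helper, and the sample values are made up rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats: str) -> dict[str, str]:
    """Parse a CoNLL-U FEATS string such as 'Gender=Com|Number=Sing'."""
    if feats == "_":  # "_" marks an empty feature set
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS values for a handful of NOUN tokens (not treebank data).
sample_feats = [
    "Gender=Com|Number=Sing",
    "Gender=Neut|Number=Sing",
    "Number=Plur",
    "Gender=Com|Number=Sing",
]

feature_counts = Counter()                   # how many tokens carry each feature
combination_counts = Counter(sample_feats)   # how often each full combination occurs
for feats in sample_feats:
    feature_counts.update(parse_feats(feats).keys())

total = len(sample_feats)
for name, count in feature_counts.most_common():
    print(f"{name} ({count}; {100 * count / total:.0f}% instances)")
```

Counting the raw FEATS strings is enough for feature combinations because CoNLL-U requires the pairs within a token to be sorted alphabetically by feature name.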
Relations
NOUN nodes are attached to their parents using 26 different relations: nmod (9455; 19% instances), obl (8522; 17% instances), nsubj (7280; 15% instances), obj (6251; 13% instances), conj (3865; 8% instances), root (2927; 6% instances), obl:arg (2546; 5% instances), nsubj:pass (2022; 4% instances), fixed (1766; 4% instances), appos (959; 2% instances), xcomp (917; 2% instances), parataxis (848; 2% instances), advcl (653; 1% instances), obl:agent (496; 1% instances), compound:prt (222; 0% instances), flat (205; 0% instances), iobj (105; 0% instances), amod (89; 0% instances), ccomp (79; 0% instances), acl:relcl (78; 0% instances), acl (66; 0% instances), orphan (56; 0% instances), case (27; 0% instances), csubj (20; 0% instances), nsubj:outer (6; 0% instances), nmod:poss (2; 0% instances)
Parents of NOUN nodes belong to 15 different parts of speech: VERB (27349; 55% instances), NOUN (12922; 26% instances), [root, no parent] (2927; 6% instances), ADJ (1873; 4% instances), PROPN (1539; 3% instances), ADP (1049; 2% instances), NUM (541; 1% instances), DET (343; 1% instances), PRON (294; 1% instances), ADV (249; 1% instances), X (242; 0% instances), SYM (108; 0% instances), SCONJ (22; 0% instances), INTJ (3; 0% instances), CCONJ (1; 0% instances)
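Relation and parent statistics like these can be computed from the UPOS, HEAD and DEPREL columns of a CoNLL-U file. The following is a rough sketch, not the script that generated this page: read_tokens is a hypothetical helper, the sample sentence is invented, and labelling root-attached nodes as "ROOT" is a choice made for this example.

```python
from collections import Counter

# Hypothetical three-token sentence in CoNLL-U format (not taken from the treebank).
SAMPLE = "\n".join([
    "# text = De oorlog begon",
    "1\tDe\tde\tDET\t_\t_\t2\tdet\t_\t_",
    "2\toorlog\toorlog\tNOUN\t_\tGender=Com|Number=Sing\t3\tnsubj\t_\t_",
    "3\tbegon\tbeginnen\tVERB\t_\t_\t0\troot\t_\t_",
])


def read_tokens(sentence: str):
    """Yield (id, upos, head, deprel) for the basic tokens of one sentence.

    Comment lines, multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1) are skipped.
    """
    for line in sentence.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue
        yield int(cols[0]), cols[3], int(cols[6]), cols[7]


relation_counts = Counter()    # DEPREL of each NOUN node
parent_pos_counts = Counter()  # UPOS of the parent of each NOUN node

# Sentences are separated by blank lines; token IDs restart in every sentence.
for sentence in SAMPLE.strip().split("\n\n"):
    tokens = list(read_tokens(sentence))
    upos_by_id = {tid: upos for tid, upos, _, _ in tokens}
    for tid, upos, head, deprel in tokens:
        if upos != "NOUN":
            continue
        relation_counts[deprel] += 1
        # HEAD 0 marks the sentence root, which has no parent token.
        parent_pos_counts[upos_by_id.get(head, "ROOT")] += 1

print(relation_counts.most_common())
print(parent_pos_counts.most_common())
```

Because a root-attached node has no parent token, its count in the parent list equals the count of the root relation (2927 in both lists here).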
4183 (8%) NOUN nodes are leaves.
11967 (24%) NOUN nodes have one child.
15336 (31%) NOUN nodes have two children.
17976 (36%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 17.
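The four child-degree buckets partition the 49462 NOUN tokens, which can be verified directly from the counts above. A small consistency check (the dictionary layout is just one convenient way to hold the numbers, and rounding to the nearest integer is an assumption that happens to reproduce the listed percentages):

```python
# Counts copied from the child-degree lines above.
total_nouns = 49462
degree_buckets = {
    "leaves": 4183,
    "one child": 11967,
    "two children": 15336,
    "three or more children": 17976,
}

# Every NOUN token falls into exactly one bucket.
assert sum(degree_buckets.values()) == total_nouns

# Share of each bucket, rounded to the nearest whole percent.
shares = {label: round(100 * count / total_nouns) for label, count in degree_buckets.items()}
for label, share in shares.items():
    print(f"{label}: {share}%")
```

The rounded shares add up to 99 rather than 100, which is an effect of rounding and not a sign of missing tokens.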
Children of NOUN nodes are attached using 34 different relations: det (28286; 26% instances), case (19953; 18% instances), amod (15198; 14% instances), nmod (14280; 13% instances), punct (7576; 7% instances), conj (3639; 3% instances), appos (3000; 3% instances), cc (2944; 3% instances), nmod:poss (2628; 2% instances), acl:relcl (2056; 2% instances), nummod (1804; 2% instances), cop (1546; 1% instances), nsubj (1420; 1% instances), mark (1346; 1% instances), acl (1300; 1% instances), parataxis (551; 1% instances), advmod (507; 0% instances), fixed (442; 0% instances), obl (433; 0% instances), flat (412; 0% instances), csubj (113; 0% instances), advcl (102; 0% instances), orphan (79; 0% instances), aux (62; 0% instances), expl (45; 0% instances), cc:preconj (37; 0% instances), obl:arg (19; 0% instances), ccomp (14; 0% instances), obj (5; 0% instances), iobj (2; 0% instances), aux:pass (1; 0% instances), compound:prt (1; 0% instances), nsubj:outer (1; 0% instances), xcomp (1; 0% instances)
Children of NOUN nodes belong to 16 different parts of speech: DET (28425; 26% instances), ADP (20293; 18% instances), ADJ (13617; 12% instances), NOUN (12922; 12% instances), PUNCT (7576; 7% instances), PROPN (7535; 7% instances), VERB (4962; 5% instances), PRON (3112; 3% instances), CCONJ (2974; 3% instances), NUM (2836; 3% instances), ADV (1846; 2% instances), AUX (1609; 1% instances), SCONJ (1266; 1% instances), X (548; 0% instances), SYM (280; 0% instances), INTJ (2; 0% instances)