This page pertains to UD version 2.

Treebank Statistics: UD_Dutch-Alpino: POS Tags: NUM

There are 702 NUM lemmas (3%), 715 NUM types (3%) and 3549 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: twee, drie, één, hoeveel, vier, 1, vijf, tien, zes, 2

The 10 most frequent NUM types: twee, drie, hoeveel, een, vier, 1, vijf, tien, zes, 2

The 10 most frequent ambiguous lemmas: twee (NUM 268, ADJ 113, PROPN 1), drie (NUM 190, ADJ 47), één (ADJ 296, NUM 188, PROPN 1), vier (NUM 111, ADJ 21, NOUN 1), 1 (NUM 82, PROPN 8, ADJ 2), vijf (NUM 70, ADJ 18, NOUN 1), tien (NUM 68, NOUN 2), zes (NUM 60, ADJ 4, NOUN 2), 2 (NUM 50, PROPN 4, ADJ 1), 3 (NUM 49, ADJ 1, PROPN 1)

The 10 most frequent ambiguous types: twee (NUM 244, PROPN 1), een (DET 4131, NUM 129, CCONJ 1), 1 (NUM 82, PROPN 8), zes (NUM 57, NOUN 1), 2 (NUM 50, PROPN 4), 3 (NUM 49, PROPN 1), één (NUM 41, PROPN 1), acht (NUM 30, VERB 12, NOUN 1), zoveel (NUM 28, ADV 7), 10 (NUM 26, PROPN 1)
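The ambiguity lists above can be reproduced from (item, tag) pairs. The sketch below assumes that "ambiguous" means an item observed with NUM and at least one other tag. The function name `ambiguous_items` and the four sample pairs are hypothetical and are not taken from the treebank.

```python
from collections import Counter, defaultdict

def ambiguous_items(pairs, tag="NUM"):
    """Return items seen with `tag` and at least one other tag, with per-tag counts."""
    tags_by_item = defaultdict(Counter)
    for item, upos in pairs:
        tags_by_item[item][upos] += 1
    return {
        item: counts
        for item, counts in tags_by_item.items()
        if tag in counts and len(counts) > 1
    }

# Hypothetical (form, UPOS) pairs, for illustration only.
sample = [("acht", "NUM"), ("acht", "VERB"), ("acht", "NUM"), ("tien", "NUM")]
result = ambiguous_items(sample)
print(result)
```

Passing (lemma, tag) pairs gives the lemma-based list, and passing (word form, tag) pairs gives the type-based list.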

Morphology

The form / lemma ratio of NUM is 1.018519 (the average of all parts of speech is 1.221562).
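The ratio above is consistent with dividing the number of NUM types by the number of NUM lemmas reported at the top of this page. A minimal check, assuming that this is how the ratio is defined:

```python
# Counts taken from the summary line of this page.
num_types = 715
num_lemmas = 702

# Assumed definition: distinct word forms (types) per distinct lemma.
ratio = num_types / num_lemmas
print(f"{ratio:.6f}")
```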

The highest number of forms (4) was observed with the lemma “drie”: drie, drieen, drietjes, drieën.

The 2nd highest number of forms (4) was observed with the lemma “één”: Eén, een, eentje, één.

The 3rd highest number of forms (3) was observed with the lemma “twee”: twee, tweeen, tweetjes.

NUM does not occur with any features.

Relations

NUM nodes are attached to their parents using 23 different relations: nummod (1888; 53% instances), obl (555; 16% instances), nmod (237; 7% instances), fixed (228; 6% instances), conj (129; 4% instances), appos (115; 3% instances), nsubj (71; 2% instances), flat (62; 2% instances), obj (56; 2% instances), parataxis (39; 1% instances), root (38; 1% instances), det (31; 1% instances), obl:arg (21; 1% instances), advcl (20; 1% instances), amod (20; 1% instances), nsubj:pass (15; 0% instances), acl:relcl (6; 0% instances), orphan (6; 0% instances), acl (3; 0% instances), ccomp (3; 0% instances), xcomp (3; 0% instances), iobj (2; 0% instances), obl:agent (1; 0% instances)

Parents of NUM nodes belong to 13 different parts of speech: NOUN (1995; 56% instances), VERB (687; 19% instances), NUM (243; 7% instances), PROPN (184; 5% instances), SYM (164; 5% instances), ADJ (88; 2% instances), X (82; 2% instances), [no parent; root] (38; 1% instances), ADV (27; 1% instances), DET (21; 1% instances), PRON (15; 0% instances), ADP (4; 0% instances), CCONJ (1; 0% instances)
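Counts like the two lists above can be derived from a CoNLL-U file, whose token lines have ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The following is a minimal sketch, not the script that generated this page. The function name `num_attachment_stats` and the sample sentence are hypothetical. A NUM node with HEAD 0 has no parent word, so it is counted under an empty tag.

```python
from collections import Counter

def num_attachment_stats(conllu_text):
    """Count incoming relations and parent UPOS tags of NUM nodes."""
    relations = Counter()
    parent_tags = Counter()
    for block in conllu_text.strip().split("\n\n"):
        upos_by_id = {}
        rows = []
        for line in block.splitlines():
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            token_id = cols[0]
            # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            if "-" in token_id or "." in token_id:
                continue
            upos_by_id[token_id] = cols[3]
            rows.append(cols)
        for cols in rows:
            if cols[3] != "NUM":
                continue
            relations[cols[7]] += 1
            # HEAD "0" is not a token ID, so the root falls back to "".
            parent_tags[upos_by_id.get(cols[6], "")] += 1
    return relations, parent_tags

# Hypothetical sentence for illustration; not taken from the treebank.
tokens = [
    ("1", "Ik", "ik", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "zie", "zien", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "twee", "twee", "NUM", "_", "_", "4", "nummod", "_", "_"),
    ("4", "honden", "hond", "NOUN", "_", "_", "2", "obj", "_", "_"),
]
sample = "\n".join("\t".join(t) for t in tokens)
relations, parent_tags = num_attachment_stats(sample)
print(relations, parent_tags)
```

Swapping the roles of child and parent in the second loop (looking at tokens whose HEAD points to a NUM node) would give the corresponding statistics for children of NUM nodes.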

2072 (58%) NUM nodes are leaves.

823 (23%) NUM nodes have one child.

417 (12%) NUM nodes have two children.

237 (7%) NUM nodes have three or more children.

The highest child degree of a NUM node is 9.
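The four child-count figures above partition the 3549 NUM tokens, and the percentages follow from rounding each share. A small check using only numbers from this page:

```python
# Counts taken from this page.
total_num_tokens = 3549
degree_counts = {
    "leaves": 2072,
    "one child": 823,
    "two children": 417,
    "three or more children": 237,
}

# The four groups should cover every NUM token exactly once.
assert sum(degree_counts.values()) == total_num_tokens

# Share of each group, rounded to whole percent.
shares = {label: round(100 * n / total_num_tokens) for label, n in degree_counts.items()}
print(shares)
```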

Children of NUM nodes are attached using 26 different relations: case (750; 30% instances), punct (376; 15% instances), fixed (254; 10% instances), flat (233; 9% instances), nmod (228; 9% instances), amod (166; 7% instances), conj (139; 6% instances), cc (110; 4% instances), det (76; 3% instances), cop (30; 1% instances), nsubj (28; 1% instances), mark (18; 1% instances), advcl (15; 1% instances), parataxis (15; 1% instances), acl:relcl (11; 0% instances), advmod (10; 0% instances), acl (8; 0% instances), nmod:poss (7; 0% instances), orphan (7; 0% instances), obl (5; 0% instances), nummod (4; 0% instances), cc:preconj (3; 0% instances), appos (2; 0% instances), aux (2; 0% instances), csubj (2; 0% instances), expl (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADP (768; 31% instances), PUNCT (376; 15% instances), NOUN (266; 11% instances), NUM (243; 10% instances), PROPN (206; 8% instances), CCONJ (135; 5% instances), DET (125; 5% instances), ADJ (81; 3% instances), ADV (67; 3% instances), PRON (65; 3% instances), SYM (50; 2% instances), VERB (44; 2% instances), AUX (32; 1% instances), X (27; 1% instances), SCONJ (15; 1% instances)