
This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: POS Tags: NUM

There are 6500 NUM lemmas (8%), 6504 NUM types (3%) and 71308 NUM tokens (2%). Out of 16 observed tags, the rank of NUM is: 5 in number of lemmas, 6 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30

The 10 most frequent NUM types: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30

The 10 most frequent ambiguous lemmas: 20 (NUM 1030, X 1), 2002 (NUM 477, X 1), sieben (NUM 403, VERB 1), II (NUM 219, PROPN 1, X 1), x (X 104, NUM 13, NOUN 3), elf (NUM 113, X 1), 512 (NUM 84, X 1), ein (DET 68956, ADP 1487, NUM 77), i (NUM 9, NOUN 2, X 1), V (NOUN 39, NUM 34)

The 10 most frequent ambiguous types: 20 (NUM 1030, X 1), 2002 (NUM 477, X 1), sieben (NUM 403, VERB 1), II (NUM 219, PROPN 1, X 1), x (X 104, NUM 13), elf (NUM 113, X 1), eins (NUM 83, DET 26), 512 (NUM 84, X 1), ein (DET 14652, ADP 1487, NUM 55), i (NUM 9, NOUN 2, X 1)

Morphology

The form / lemma ratio of NUM is 1.000615 (the average of all parts of speech is 2.529726).
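The reported ratio matches the number of NUM types divided by the number of NUM lemmas given at the top of the page. A minimal check, assuming that is how the ratio is defined (the variable names are illustrative, not part of any UD tool):

```python
# Counts taken from the summary at the top of this page (NUM in UD_German-HDT).
num_types = 6504   # distinct word forms tagged NUM
num_lemmas = 6500  # distinct lemmas tagged NUM

# Assumption: form / lemma ratio = distinct forms / distinct lemmas.
form_lemma_ratio = num_types / num_lemmas
print(f"{form_lemma_ratio:.6f}")
```

A value this close to 1 means that almost every NUM lemma appears in a single form; the lemma “ein” with its five forms accounts for the small excess.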

The 1st highest number of forms (5) was observed with the lemma “ein”: ein, eine, einem, einen, einer.

The 2nd highest number of forms (1) was observed with the lemma “’68”: ’68.

The 3rd highest number of forms (1) was observed with the lemma “’95”: ’95.

NUM occurs with 4 features: NumType (71307; 100% instances), Number (71302; 100% instances), Case (48; 0% instances), Gender (26; 0% instances)

NUM occurs with 10 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, NumType=Card, Number=Plur, Number=Sing

NUM occurs with 15 feature combinations. The most frequent feature combination is Number=Plur|NumType=Card (70456 tokens). Examples: zwei, 2000, drei, 2001, 1999, vier, fünf, 20, 100, 30

Relations

NUM nodes are attached to their parents using 18 different relations: nummod (48687; 68% instances), flat (8506; 12% instances), nmod (7856; 11% instances), obl (3579; 5% instances), conj (853; 1% instances), appos (526; 1% instances), obj (501; 1% instances), nsubj (465; 1% instances), nsubj:pass (154; 0% instances), root (88; 0% instances), xcomp (54; 0% instances), parataxis (18; 0% instances), obl:arg (10; 0% instances), advcl (4; 0% instances), reparandum (3; 0% instances), orphan (2; 0% instances), acl (1; 0% instances), ccomp (1; 0% instances)

Parents of NUM nodes belong to 11 different parts of speech: NOUN (60264; 85% instances), VERB (4150; 6% instances), PROPN (4097; 6% instances), X (1144; 2% instances), NUM (729; 1% instances), ADJ (556; 1% instances), ADV (115; 0% instances), AUX (103; 0% instances), [no parent; the NUM node is the root] (88; 0% instances), DET (41; 0% instances), PRON (21; 0% instances)
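Counts like these can be derived from the treebank's CoNLL-U files. The following is a rough sketch, not the script that produced this page: the function name `num_attachment_stats`, the helper `row`, and the three-sentence sample are hypothetical. A NUM token whose head is 0 has no parent, so the sketch records its parent tag as an empty string; this corresponds to the 88 NUM nodes attached as root.

```python
from collections import Counter


def row(*cols):
    """Join the ten CoNLL-U columns of one token with tabs."""
    return "\t".join(cols)


# Hand-made sample for illustration only (not taken from UD_German-HDT).
SAMPLE = "\n".join([
    "# text = Zwei Jahre vergingen .",
    row("1", "Zwei", "zwei", "NUM", "_", "NumType=Card", "2", "nummod", "_", "_"),
    row("2", "Jahre", "Jahr", "NOUN", "_", "_", "3", "nsubj", "_", "_"),
    row("3", "vergingen", "vergehen", "VERB", "_", "_", "0", "root", "_", "_"),
    row("4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"),
    "",
    "# text = Jahr 2000",
    row("1", "Jahr", "Jahr", "NOUN", "_", "_", "0", "root", "_", "_"),
    row("2", "2000", "2000", "NUM", "_", "NumType=Card", "1", "nmod", "_", "_"),
    "",
    "# text = 2000",
    row("1", "2000", "2000", "NUM", "_", "NumType=Card", "0", "root", "_", "_"),
])


def num_attachment_stats(conllu_text):
    """Count incoming relations and parent UPOS tags of NUM tokens."""
    relations, parent_tags = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain word lines; skip multiword ranges (1-2) and empty nodes (1.1).
        rows = [r for r in rows if r[0].isdigit()]
        upos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] == "NUM":
                relations[r[7]] += 1
                # Head "0" is the artificial root, so no tag is found for it.
                parent_tags[upos_by_id.get(r[6], "")] += 1
    return relations, parent_tags


relations, parent_tags = num_attachment_stats(SAMPLE)
print(relations.most_common())
print(parent_tags.most_common())
```

The sketch reads columns 4 (UPOS), 7 (HEAD) and 8 (DEPREL) of each token line and ignores enhanced dependencies.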

51707 (73%) NUM nodes are leaves.

15529 (22%) NUM nodes have one child.

3079 (4%) NUM nodes have two children.

993 (1%) NUM nodes have three or more children.

The highest child degree of a NUM node is 10.
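The four child-degree counts above cover every NUM token, and the percentages follow from rounding each share to the nearest whole number. A small consistency check using only figures from this page (the dictionary keys are illustrative labels):

```python
# Number of NUM nodes by number of children, as listed above.
degree_counts = {"0": 51707, "1": 15529, "2": 3079, "3+": 993}

# The counts should add up to the 71308 NUM tokens reported at the top.
total = sum(degree_counts.values())

# Share of each degree class, rounded to whole percent.
shares = {k: round(100 * v / total) for k, v in degree_counts.items()}
print(total, shares)
```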

Children of NUM nodes are attached using 23 different relations: advmod (12956; 51% instances), case (4311; 17% instances), nmod (3537; 14% instances), conj (1498; 6% instances), punct (1446; 6% instances), cc (543; 2% instances), appos (298; 1% instances), det (239; 1% instances), amod (126; 0% instances), nsubj (93; 0% instances), cop (90; 0% instances), flat:name (32; 0% instances), aux (25; 0% instances), flat (21; 0% instances), acl (9; 0% instances), ccomp (7; 0% instances), mark (5; 0% instances), advcl (4; 0% instances), nummod (4; 0% instances), parataxis (3; 0% instances), reparandum (3; 0% instances), xcomp (3; 0% instances), obl (2; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: ADV (11402; 45% instances), ADP (5421; 21% instances), NOUN (2953; 12% instances), ADJ (1650; 7% instances), PUNCT (1446; 6% instances), NUM (729; 3% instances), CCONJ (622; 2% instances), PROPN (296; 1% instances), DET (251; 1% instances), X (233; 1% instances), AUX (117; 0% instances), PRON (91; 0% instances), VERB (20; 0% instances), PART (19; 0% instances), SCONJ (5; 0% instances)