This page pertains to UD version 2.

Treebank Statistics: UD_German-HDT: POS Tags: DET

There are 48 DET lemmas (0%), 195 DET types (0%) and 494367 DET tokens (14%). Out of 16 observed tags, the rank of DET is: 9 in number of lemmas, 9 in number of types and 2 in number of tokens.
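Counts of this kind can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the sample sentence are hypothetical, and lowercasing word forms before counting types is an assumption.

```python
# Hypothetical CoNLL-U fragment (not taken from UD_German-HDT).
SAMPLE = (
    "1\tDer\tder\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tHund\tHund\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tsieht\tsehen\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\tdie\tder\tDET\t_\t_\t5\tdet\t_\t_\n"
    "5\tKatze\tKatze\tNOUN\t_\t_\t3\tobj\t_\t_\n"
)


def upos_counts(conllu_text, upos):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            lemmas.add(cols[2])
            types.add(cols[1].lower())  # assumption: types are case-insensitive
            tokens += 1
    return len(lemmas), len(types), tokens


print(upos_counts(SAMPLE, "DET"))  # prints (1, 2, 2)
```

In the sample, "Der" and "die" share the lemma "der", so the two DET tokens yield one lemma and two types.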

The 10 most frequent DET lemmas: der, ein, dieser, sein, ihr, alle, anderer, kein, viel, einige

The 10 most frequent DET types: der, die, dem, den, das, des, eine, ein, einen, einer

The 10 most frequent ambiguous lemmas: der (DET 359943, PRON 28684, X 2), ein (DET 68956, ADP 1487, NUM 77), sein (AUX 43408, DET 9257), ihr (DET 7652, PRON 22), alle (DET 7640, ADJ 2), anderer (DET 5696, X 1), viel (DET 2862, ADV 373), mehr (ADV 6307, DET 1090, ADJ 4, X 3, PROPN 2), beide (ADJ 1011, DET 1005), solcher (DET 880, ADJ 6)

The 10 most frequent ambiguous types: der (DET 91439, PRON 4856, X 2), die (DET 77836, PRON 12604, X 2), dem (DET 66367, PRON 1681, X 1), den (DET 37055, PRON 620, ADJ 1, PROPN 1), das (DET 25405, PRON 4235, SCONJ 5, X 1), des (DET 22379, X 16, PROPN 3), eine (DET 17787, NUM 9), ein (DET 14652, ADP 1487, NUM 55), einen (DET 10089, NUM 1), einer (DET 9370, NUM 7)

Morphology

The form / lemma ratio of DET is 4.062500 (the average of all parts of speech is 2.529726).
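The reported value agrees with the type and lemma counts given in the summary at the top of this page (195 types, 48 lemmas). Treating the ratio as types divided by lemmas is an assumption that happens to reproduce the reported figure; the arithmetic can be checked as follows.

```python
det_types = 195   # DET types, from the summary at the top of this page
det_lemmas = 48   # DET lemmas, from the same summary

ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # prints 4.062500
```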

The highest number of forms (10) was observed with the lemma “ein”: ‘n, ein, eine, eine(n), einem, einem/er, einen, einer, eines, eins.

The 2nd highest number of forms (8) was observed with the lemma “anderer”: a., andere, anderem, anderen, anderer, anderes, andern, anders.

The 3rd highest number of forms (8) was observed with the lemma “derjenige”: dasjenige, demjenigen, denjenigen, derjenige, derjenigen, desjenigen, diejenige, diejenigen.

DET occurs with 14 features: PronType (494365; 100% instances), Number (493647; 100% instances), Case (489411; 99% instances), Definite (428897; 87% instances), Gender (395436; 80% instances), NumType (69961; 14% instances), Person (18373; 4% instances), Poss (18373; 4% instances), Number[psor] (16884; 3% instances), Gender[psor] (15470; 3% instances), Degree (1346; 0% instances), Polite (50; 0% instances), Foreign (17; 0% instances), Typo (13; 0% instances)

DET occurs with 35 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Masc,Neut, Gender=Neut, Gender[psor]=Fem, Gender[psor]=Masc,Neut, NumType=Card, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Person=1, Person=2, Person=3, Polite=Form, Poss=Yes, PronType=Art, PronType=Dem, PronType=Ind, PronType=Int, PronType=Int,Rel, PronType=Neg, PronType=Prs, PronType=Tot, Typo=Yes

DET occurs with 315 feature combinations. The most frequent feature combination is Case=Dat|Definite=Def|Gender=Masc,Neut|Number=Sing|PronType=Art (47860 tokens). Examples: dem
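A feature combination such as the one above follows the CoNLL-U FEATS layout: pairs are separated by `|`, a name and its value by `=`, and multiple values of one feature by a comma (as in `Gender=Masc,Neut`). A minimal parsing sketch, with the hypothetical helper name `parse_feats`:

```python
def parse_feats(feats):
    """Split a CoNLL-U FEATS string into {feature: [values]}."""
    if feats in ("", "_"):  # "_" marks an empty feature set
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, value = pair.split("=", 1)
        parsed[name] = value.split(",")  # comma separates multiple values
    return parsed


combo = "Case=Dat|Definite=Def|Gender=Masc,Neut|Number=Sing|PronType=Art"
feats = parse_feats(combo)
print(feats["Gender"])  # prints ['Masc', 'Neut']
```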

Relations

DET nodes are attached to their parents using 21 different relations: det (480231; 97% instances), nsubj (3939; 1% instances), nmod (3044; 1% instances), obj (2089; 0% instances), obl (2046; 0% instances), root (1232; 0% instances), nsubj:pass (573; 0% instances), appos (386; 0% instances), conj (345; 0% instances), obl:arg (175; 0% instances), xcomp (66; 0% instances), advmod (50; 0% instances), fixed (41; 0% instances), advcl (38; 0% instances), ccomp (32; 0% instances), parataxis (29; 0% instances), acl (20; 0% instances), reparandum (15; 0% instances), det:poss (11; 0% instances), orphan (3; 0% instances), csubj (2; 0% instances)
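The 21 relation counts above sum to 494367, the total number of DET tokens. The percentages appear to be each count's share of that total, rounded to the nearest integer; this is an assumption, checked below against the figures on this page.

```python
# Relation counts copied from the list above.
relation_counts = {
    "det": 480231, "nsubj": 3939, "nmod": 3044, "obj": 2089, "obl": 2046,
    "root": 1232, "nsubj:pass": 573, "appos": 386, "conj": 345,
    "obl:arg": 175, "xcomp": 66, "advmod": 50, "fixed": 41, "advcl": 38,
    "ccomp": 32, "parataxis": 29, "acl": 20, "reparandum": 15,
    "det:poss": 11, "orphan": 3, "csubj": 2,
}

total = sum(relation_counts.values())
print(total)  # prints 494367

# Share of each relation, rounded to a whole percent (assumed rounding rule).
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}
print(shares["det"], shares["nsubj"], shares["obj"])  # prints 97 1 0
```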

Parents of DET nodes belong to 14 different parts of speech: NOUN (451320; 91% instances), PROPN (22519; 5% instances), X (7528; 2% instances), VERB (7519; 2% instances), ADJ (2398; 0% instances), no parent, i.e. root nodes (1232; 0% instances), DET (794; 0% instances), AUX (365; 0% instances), ADV (260; 0% instances), NUM (251; 0% instances), PRON (130; 0% instances), ADP (44; 0% instances), SCONJ (6; 0% instances), INTJ (1; 0% instances)

482595 (98%) DET nodes are leaves.

8424 (2%) DET nodes have one child.

2271 (0%) DET nodes have two children.

1077 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 10.
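The four child-count figures above sum to 494367, so every DET token falls into exactly one group. A child-degree distribution of this kind can be derived from the HEAD column alone. The sketch below is illustrative only: the sentence and the function name `child_degrees` are hypothetical.

```python
from collections import Counter

# Hypothetical sentence as (id, upos, head) triples; head 0 marks the root.
SENTENCE = [
    (1, "DET", 2),
    (2, "NOUN", 3),
    (3, "VERB", 0),
    (4, "ADP", 5),
    (5, "DET", 3),
    (6, "PUNCT", 3),
]


def child_degrees(sentence, upos):
    """Map child count -> number of nodes with that tag and that many children."""
    # Number of children per node id: count how often each id occurs as a head.
    children = Counter(head for _, _, head in sentence)
    return Counter(children[node_id] for node_id, tag, _ in sentence if tag == upos)


degrees = child_degrees(SENTENCE, "DET")
print(degrees[0], degrees[1])  # prints 1 1
```

Here node 1 is a leaf and node 5 has one child (the ADP at position 4), so one DET node has degree 0 and one has degree 1.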

Children of DET nodes are attached using 27 different relations: case (4920; 28% instances), advmod (4109; 23% instances), nmod (2963; 17% instances), punct (2100; 12% instances), det (652; 4% instances), cop (476; 3% instances), acl (469; 3% instances), nsubj (455; 3% instances), obl (319; 2% instances), cc (303; 2% instances), conj (294; 2% instances), appos (133; 1% instances), obl:arg (82; 0% instances), mark (62; 0% instances), ccomp (46; 0% instances), parataxis (45; 0% instances), aux (40; 0% instances), advcl (38; 0% instances), reparandum (15; 0% instances), amod (12; 0% instances), csubj (12; 0% instances), fixed (9; 0% instances), xcomp (8; 0% instances), flat:name (5; 0% instances), expl (3; 0% instances), nummod (3; 0% instances), flat (1; 0% instances)

Children of DET nodes belong to 15 different parts of speech: ADP (4736; 27% instances), ADV (3496; 20% instances), NOUN (2665; 15% instances), PUNCT (2100; 12% instances), PROPN (1028; 6% instances), DET (794; 5% instances), ADJ (590; 3% instances), VERB (557; 3% instances), AUX (556; 3% instances), CCONJ (517; 3% instances), PRON (266; 2% instances), PART (161; 1% instances), NUM (41; 0% instances), X (38; 0% instances), SCONJ (29; 0% instances)