This page pertains to UD version 2.

Treebank Statistics: UD_German-GSD: Features: Gender

This feature is universal. It occurs with 3 different values: Fem, Masc, Neut.

This is a layered feature with the following layers: Gender, Gender[psor].
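As a minimal sketch of how the two layers can be told apart, the snippet below parses a CoNLL-U FEATS string into a dictionary. In CoNLL-U, features are `Name=Value` pairs joined by `|`, and a layer appears in square brackets after the feature name. The helper name `parse_feats` and the example FEATS string are illustrative assumptions, not taken from the treebank.

```python
def parse_feats(feats: str) -> dict[str, str]:
    """Parse a CoNLL-U FEATS string such as 'Case=Nom|Gender=Masc' into a dict.

    An underscore ('_') marks an empty feature set in CoNLL-U.
    """
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Hypothetical FEATS value for a possessive determiner (illustrative only).
example = parse_feats(
    "Case=Nom|Gender=Fem|Gender[psor]=Masc|Number=Sing|Poss=Yes|PronType=Prs"
)

# The base layer and the possessor layer are stored under different keys.
print(example["Gender"])
print(example["Gender[psor]"])
```

Because the layer is part of the key, a lookup of `Gender` alone does not pick up `Gender[psor]`.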

133600 tokens (46%) have a non-empty value of Gender. 40697 types (80%) occur at least once with a non-empty value of Gender. 34924 lemmas (83%) occur at least once with a non-empty value of Gender. The feature is used with 9 part-of-speech tags: NOUN (50961; 17% instances), DET (35813; 12% instances), PROPN (26203; 9% instances), ADJ (14124; 5% instances), PRON (6249; 2% instances), NUM (102; 0% instances), X (79; 0% instances), ADV (59; 0% instances), SYM (10; 0% instances).
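Counts of this kind can be approximated from a CoNLL-U file with a short script. The sketch below is not the tool that generated this page; the sample sentence, the helper names, and the decision to skip multiword-token ranges and empty nodes are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical five-word sample, not taken from UD_German-GSD.
# Fields: ID, FORM, LEMMA, UPOS, FEATS, HEAD, DEPREL (XPOS, DEPS, MISC are "_").
ROWS = [
    ("1", "Der", "der", "DET", "Case=Nom|Gender=Masc|Number=Sing", "3", "det"),
    ("2", "erste", "erst", "ADJ", "Case=Nom|Gender=Masc|Number=Sing", "3", "amod"),
    ("3", "Tag", "Tag", "NOUN", "Case=Nom|Gender=Masc|Number=Sing", "4", "nsubj"),
    ("4", "endet", "enden", "VERB", "Number=Sing|Person=3", "0", "root"),
    ("5", ".", ".", "PUNCT", "_", "4", "punct"),
]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, upos, "_", feats, head, rel, "_", "_"])
    for i, form, lemma, upos, feats, head, rel in ROWS
)


def read_words(conllu_text):
    """Yield the 10 columns of every syntactic word in a CoNLL-U string."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def gender_counts(conllu_text):
    """Return (all words per UPOS, words with non-empty Gender per UPOS)."""
    total, with_gender = Counter(), Counter()
    for cols in read_words(conllu_text):
        upos, feats = cols[3], cols[5]
        total[upos] += 1
        names = set() if feats == "_" else {p.split("=", 1)[0] for p in feats.split("|")}
        if "Gender" in names:  # exact match, so Gender[psor] alone does not count
            with_gender[upos] += 1
    return total, with_gender


total, with_gender = gender_counts(SAMPLE)
n_all, n_gender = sum(total.values()), sum(with_gender.values())
print(f"{n_gender} of {n_all} words ({100 * n_gender / n_all:.0f}%) have Gender")
for upos, count in with_gender.most_common():
    print(upos, count, f"{100 * count / total[upos]:.0f}% of all {upos}")
```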

NOUN

50961 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.

The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (36862; 72%).

NOUN tokens may have the following values of Gender:

Paradigm: Tag (columns: Masc, Fem, Neut; non-empty cells only, separated by " / ")
Case=Acc|Number=Sing: Tag
Case=Acc|Number=Plur: Tage
Case=Dat|Number=Sing: Tag, Tage
Case=Dat|Number=Plur: Tagen
Case=Gen|Number=Sing: Tages, Tags
Case=Gen|Number=Plur: Tage / Tages
Case=Nom|Number=Sing: Tag / Tage
Case=Nom|Number=Plur: Tage

Gender seems to be a lexical feature of NOUN: 94% of lemmas (16986) occur with only one value of Gender.
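The "lexical feature" judgement rests on how many lemmas are seen with a single Gender value. A minimal sketch of that calculation follows; the function name `single_gender_share` and the (lemma, Gender) pairs are hypothetical, and how the page's generator treats lemmas without any Gender value is not stated here, so such lemmas are simply assumed to be absent from the input.

```python
from collections import defaultdict


def single_gender_share(pairs):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Hypothetical (lemma, Gender) observations, not taken from the treebank.
observations = [
    ("Tag", "Masc"),
    ("Tag", "Masc"),
    ("Jahr", "Neut"),
    ("Stadt", "Fem"),
    ("Teil", "Masc"),
    ("Teil", "Neut"),  # a lemma attested with two genders
]

single, total = single_gender_share(observations)
print(f"{100 * single / total:.0f}% lemmas ({single}) occur with only one value of Gender")
```

A high share suggests that Gender is a property of the lemma rather than an inflectional category of the part of speech.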

DET

35813 DET tokens (87% of all DET tokens) have a non-empty value of Gender.

The most frequent other feature values with which DET and Gender co-occurred: Number=Sing (33995; 95%), NumType=EMPTY (30232; 84%), PronType=Art (30045; 84%), Definite=Def (24595; 69%).

DET tokens may have the following values of Gender:

Paradigm der  Masc           Fem       Neut
Case=Acc      den            die       das, 's
Case=Dat      dem, der, des  der, die  dem, das, des
Case=Gen      des            der       des, der
Case=Nom      der            die       das

PROPN

26203 PROPN tokens (86% of all PROPN tokens) have a non-empty value of Gender.

The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (25072; 96%).

PROPN tokens may have the following values of Gender:

Paradigm: Deutschland (columns: Masc, Fem, Neut; non-empty cells only, separated by " / ")
Case=Acc: Deutschland
Case=Dat: Deutschland / Deutschland
Case=Gen: Deutschlands, Deutschland
Case=Nom: Deutschland / Deutschland

Gender seems to be a lexical feature of PROPN: 91% of lemmas (13219) occur with only one value of Gender.

ADJ

14124 ADJ tokens (65% of all ADJ tokens) have a non-empty value of Gender.

The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (13050; 92%), Number=Sing (9925; 70%).

ADJ tokens may have the following values of Gender:

Paradigm erst         Masc           Fem             Neut
Case=Acc|Number=Sing  ersten         erste           erste, erstes
Case=Acc|Number=Plur  ersten, erste  erste, ersten   erste, ersten
Case=Dat|Number=Sing  ersten         ersten, erster  ersten
Case=Dat|Number=Plur  ersten         ersten          ersten
Case=Gen|Number=Sing  ersten         ersten          ersten
Case=Gen|Number=Plur  ersten         ersten          ersten
Case=Nom|Number=Sing  erste, erster  erste           erste, erstes
Case=Nom|Number=Plur  ersten, erste  ersten, erste   ersten, Erste

PRON

6249 PRON tokens (58% of all PRON tokens) have a non-empty value of Gender.

The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (6249; 100%), Number=Sing (6221; 100%), Case=Nom (4875; 78%), PronType=Prs (4293; 69%), Person=3 (4269; 68%).

PRON tokens may have the following values of Gender:

Paradigm der  Masc      Fem                Neut
Case=Acc      den, der  die                das
Case=Dat      dem, der  der                dem, Das
Case=Gen      dessen    deren, der, derer  dessen
Case=Nom      der, die  die                das, die

NUM

102 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.

The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (102; 100%).

NUM tokens may have the following values of Gender:

Paradigm: 2 (columns: Masc, Fem, Neut; non-empty cells only)
Case=Acc: 2
Case=Dat: 2
Case=Nom: 2

X

79 X tokens (25% of all X tokens) have a non-empty value of Gender.

The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (79; 100%), Number=Sing (60; 76%).

X tokens may have the following values of Gender:

Paradigm: B. (columns: Masc, Fem, Neut; non-empty cells only, separated by " / ")
Case=Dat: B.
Case=Nom: B. / B.

Gender seems to be a lexical feature of X: 92% of lemmas (46) occur with only one value of Gender.

ADV

59 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.

ADV tokens may have the following values of Gender:

Paradigm: ca (columns: Fem, Neut; non-empty cells only, separated by " / ")
Case=Acc: ca / ca
Case=Dat: ca

Gender seems to be a lexical feature of ADV: 92% of lemmas (44) occur with only one value of Gender.

SYM

10 SYM tokens (10% of all SYM tokens) have a non-empty value of Gender.

SYM tokens may have the following values of Gender:

Paradigm °  Masc  Fem
            °     °

Relations with Agreement in Gender

The 10 most frequent relations where parent and child node agree in Gender: NOUN –[det]–> DET (26061; 84%), NOUN –[amod]–> ADJ (11918; 91%), PROPN –[flat]–> PROPN (4766; 82%), PROPN –[det]–> DET (4540; 82%), NOUN –[det:poss]–> DET (2175; 95%), NOUN –[appos]–> PROPN (1763; 55%), PROPN –[conj]–> PROPN (1313; 63%), PROPN –[amod]–> PROPN (1060; 75%), NOUN –[compound]–> NOUN (667; 78%), PROPN –[flat]–> NOUN (660; 84%).
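Agreement counts like these can be sketched by walking each dependency edge and comparing the Gender values of parent and child. The code below is an illustration under stated assumptions, not the page's generator: the sample sentence and helper names are hypothetical, and edges where either node lacks Gender are skipped, which may differ from the denominator used for the percentages above.

```python
from collections import Counter

# Hypothetical sample sentence, not taken from UD_German-GSD.
# Fields: ID, FORM, LEMMA, UPOS, FEATS, HEAD, DEPREL (XPOS, DEPS, MISC are "_").
ROWS = [
    ("1", "Der", "der", "DET", "Case=Nom|Gender=Masc|Number=Sing", "3", "det"),
    ("2", "erste", "erst", "ADJ", "Case=Nom|Gender=Masc|Number=Sing", "3", "amod"),
    ("3", "Tag", "Tag", "NOUN", "Case=Nom|Gender=Masc|Number=Sing", "4", "nsubj"),
    ("4", "endet", "enden", "VERB", "Number=Sing|Person=3", "0", "root"),
    ("5", ".", ".", "PUNCT", "_", "4", "punct"),
]
SAMPLE = "\n".join(
    "\t".join([i, form, lemma, upos, "_", feats, head, rel, "_", "_"])
    for i, form, lemma, upos, feats, head, rel in ROWS
)


def gender_of(feats):
    """Return the value of the base Gender layer, or None if it is absent."""
    for pair in feats.split("|"):
        name, _, value = pair.partition("=")
        if name == "Gender":
            return value
    return None


def agreement_counts(conllu_text):
    """Count edges keyed by (parent UPOS, deprel, child UPOS)."""
    agree, seen = Counter(), Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        words = {}
        for line in block.splitlines():
            cols = line.split("\t")
            if len(cols) == 10 and cols[0].isdigit():
                words[cols[0]] = cols
        for child in words.values():
            parent = words.get(child[6])  # HEAD "0" (root) has no parent word
            if parent is None:
                continue
            child_g, parent_g = gender_of(child[5]), gender_of(parent[5])
            if child_g is None or parent_g is None:
                continue  # assumption: only compare nodes that both have Gender
            key = (parent[3], child[7], child[3])
            seen[key] += 1
            if child_g == parent_g:
                agree[key] += 1
    return agree, seen


agree, seen = agreement_counts(SAMPLE)
for (parent_upos, rel, child_upos), n in agree.most_common():
    share = 100 * n / seen[(parent_upos, rel, child_upos)]
    print(f"{parent_upos} -[{rel}]-> {child_upos} ({n}; {share:.0f}%)")
```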