Treebank Statistics: UD_Arabic-NYUAD: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
477701 tokens (65%) have a non-empty value of Gender.
1 types (0) occur at least once with a non-empty value of Gender.
4838 lemmas (96%) occur at least once with a non-empty value of Gender.
The feature is used with 16 part-of-speech tags: NOUN (221645; 30% instances), ADJ (69179; 9% instances), VERB (55373; 7% instances), PROPN (54272; 7% instances), PRON (43070; 6% instances), ADV (19509; 3% instances), DET (6065; 1% instances), AUX (4101; 1% instances), NUM (3526; 0% instances), X (482; 0% instances), ADP (192; 0% instances), PUNCT (154; 0% instances), CCONJ (88; 0% instances), SCONJ (25; 0% instances), PART (17; 0% instances), INTJ (3; 0% instances).
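Counts like the ones above can be reproduced from a CoNLL-U file by tallying the Gender value in the FEATS column per UPOS tag. The following is a minimal sketch, not the script that generated this page: the function names `parse_feats` and `gender_counts` are hypothetical, and the three-token sample is invented for illustration rather than taken from the treebank.

```python
from collections import Counter


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def gender_counts(conllu_text):
    """Count Gender values per UPOS tag; tokens without Gender count as EMPTY."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword token ranges and empty nodes
        upos = cols[3]
        gender = parse_feats(cols[5]).get("Gender", "EMPTY")
        counts[(upos, gender)] += 1
    return counts


# Invented sample (assumption: NOT real treebank data).
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\t_\tlemma1\tNOUN\t_\tCase=Gen|Gender=Fem|Number=Sing\t0\troot\t_\t_",
    "2\t_\tlemma2\tADJ\t_\tGender=Fem|Number=Sing\t1\tamod\t_\t_",
    "3\t_\tlemma3\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])

counts = gender_counts(SAMPLE)
non_empty = sum(n for (_, g), n in counts.items() if g != "EMPTY")
total = sum(counts.values())
print(f"{non_empty} of {total} tokens have a non-empty value of Gender")
```

Dividing `non_empty` by `total` gives the kind of percentage reported for the whole treebank, and restricting the sum to one UPOS tag gives the per-tag figures.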
NOUN
221645 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (197322; 89%), Case=Gen (142652; 64%).
NOUN tokens may have the following values of Gender:
Fem (67006; 30% of non-empty Gender)
Masc (154639; 70% of non-empty Gender)
EMPTY (254)
ADJ
69179 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (66123; 96%), Definite=Def (45840; 66%), Case=Gen (40733; 59%).
ADJ tokens may have the following values of Gender:
Fem (32171; 47% of non-empty Gender)
Masc (37008; 53% of non-empty Gender)
EMPTY (176)
VERB
55373 VERB tokens (100% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=3 (51943; 94%), Voice=Act (51452; 93%), Mood=Ind (50158; 91%), Number=Sing (49732; 90%), Aspect=Perf (28891; 52%).
VERB tokens may have the following values of Gender:
Fem (18355; 33% of non-empty Gender)
Masc (37018; 67% of non-empty Gender)
EMPTY (96)
PROPN
54272 PROPN tokens (95% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (53581; 99%), Case=EMPTY (43512; 80%), Definite=Ind (40325; 74%).
PROPN tokens may have the following values of Gender:
Fem (3150; 6% of non-empty Gender)
Masc (51122; 94% of non-empty Gender)
EMPTY (3149)
Gender seems to be a lexical feature of PROPN. 100% of lemmas (4789) occur with only one value of Gender.
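The "lexical feature" observation rests on how many lemmas are only ever seen with a single Gender value. A rough sketch of that calculation follows. The page does not state the exact procedure it used, so this is one plausible reading: the helper `single_value_share` is a hypothetical name and the lemma/value pairs are invented.

```python
from collections import defaultdict


def single_value_share(observations):
    """Given (lemma, gender) pairs with non-empty gender, return
    (lemmas seen with exactly one value, total distinct lemmas)."""
    values = defaultdict(set)
    for lemma, gender in observations:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


# Invented observations (assumption: NOT real treebank data).
toy = [
    ("a", "Masc"), ("a", "Masc"),  # lemma "a": one value only
    ("b", "Fem"),                  # lemma "b": one value only
    ("c", "Masc"), ("c", "Fem"),   # lemma "c": both values
]

single, total = single_value_share(toy)
share = single / total
print(f"{share:.0%} of lemmas ({single}) occur with only one value of Gender")
```

A share close to 100% suggests the value is a property of the lemma rather than of the individual word form, which is the sense in which the feature is called lexical here.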
PRON
43070 PRON tokens (99% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (36378; 84%), PronType=Prs (30458; 71%), Definite=Def (30207; 70%), Person=3 (29809; 69%).
PRON tokens may have the following values of Gender:
Fem (15776; 37% of non-empty Gender)
Masc (27294; 63% of non-empty Gender)
EMPTY (425)
ADV
19509 ADV tokens (81% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (19509; 100%), Number=Sing (19508; 100%), Definite=Com (15109; 77%), Case=Acc (13032; 67%).
ADV tokens may have the following values of Gender:
Fem (22; 0% of non-empty Gender)
Masc (19487; 100% of non-empty Gender)
EMPTY (4558)
DET
6065 DET tokens (95% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Definite=Ind (6055; 100%), Number=Sing (5881; 97%).
DET tokens may have the following values of Gender:
Fem (2244; 37% of non-empty Gender)
Masc (3821; 63% of non-empty Gender)
EMPTY (298)
AUX
4101 AUX tokens (45% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Voice=Act (4076; 99%), Person=3 (3924; 96%), Number=Sing (3824; 93%), Mood=Ind (3347; 82%).
AUX tokens may have the following values of Gender:
Fem (1375; 34% of non-empty Gender)
Masc (2726; 66% of non-empty Gender)
EMPTY (5054)
NUM
3526 NUM tokens (23% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (3328; 94%), Number=Sing (3125; 89%), Definite=Com (2440; 69%), Case=Gen (2114; 60%).
NUM tokens may have the following values of Gender:
Fem (1369; 39% of non-empty Gender)
Masc (2157; 61% of non-empty Gender)
EMPTY (11851)
X
482 X tokens (52% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Number=Sing (424; 88%), Mood=EMPTY (285; 59%), Person=EMPTY (277; 57%), Voice=EMPTY (277; 57%).
X tokens may have the following values of Gender:
Fem (73; 15% of non-empty Gender)
Masc (409; 85% of non-empty Gender)
EMPTY (445)
Gender seems to be a lexical feature of X. 93% of lemmas (25) occur with only one value of Gender.
ADP
192 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADP and Gender co-occurred: AdpType=Prep (176; 92%).
ADP tokens may have the following values of Gender:
Fem (53; 28% of non-empty Gender)
Masc (139; 72% of non-empty Gender)
EMPTY (91551)
PUNCT
154 PUNCT tokens (0% of all PUNCT tokens) have a non-empty value of Gender.
PUNCT tokens may have the following values of Gender:
Fem (46; 30% of non-empty Gender)
Masc (108; 70% of non-empty Gender)
EMPTY (75112)
CCONJ
88 CCONJ tokens (0% of all CCONJ tokens) have a non-empty value of Gender.
CCONJ tokens may have the following values of Gender:
Fem (28; 32% of non-empty Gender)
Masc (60; 68% of non-empty Gender)
EMPTY (49073)
SCONJ
25 SCONJ tokens (0% of all SCONJ tokens) have a non-empty value of Gender.
SCONJ tokens may have the following values of Gender:
Fem (2; 8% of non-empty Gender)
Masc (23; 92% of non-empty Gender)
EMPTY (16589)
PART
17 PART tokens (1% of all PART tokens) have a non-empty value of Gender.
PART tokens may have the following values of Gender:
Fem (2; 12% of non-empty Gender)
Masc (15; 88% of non-empty Gender)
EMPTY (2504)
INTJ
3 INTJ tokens (5% of all INTJ tokens) have a non-empty value of Gender.
INTJ tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender)
EMPTY (53)
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (44733; 82%),
NOUN –[nmod:poss]–> NOUN (31172; 57%),
NOUN –[obj]–> NOUN (26620; 59%),
VERB –[obj]–> NOUN (18078; 55%),
VERB –[nsubj]–> NOUN (16417; 88%),
PROPN –[flat]–> PROPN (13240; 93%),
NOUN –[nmod:poss]–> PRON (9184; 58%),
NOUN –[nmod]–> NOUN (8141; 70%),
VERB –[iobj]–> NOUN (7589; 55%),
ADV –[nmod:poss]–> NOUN (7578; 70%).
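Agreement figures of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is illustrative only. It assumes the percentage is the share of agreeing edges among edges where both nodes carry a Gender value, which the page does not state explicitly. The function `agreement_stats` and the four-token sentence are hypothetical.

```python
from collections import Counter


def agreement_stats(sentence):
    """sentence: list of dicts with keys id, upos, gender (or None), head, deprel.
    Returns (agree, both): Counters keyed by (parent UPOS, deprel, child UPOS)."""
    by_id = {tok["id"]: tok for tok in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # root token (head 0) has no parent node
        if parent["gender"] is None or child["gender"] is None:
            continue  # agreement is undefined without a value on both sides
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if parent["gender"] == child["gender"]:
            agree[key] += 1
    return agree, both


# Invented sentence (assumption: NOT real treebank data).
toy = [
    {"id": 1, "upos": "NOUN", "gender": "Fem", "head": 0, "deprel": "root"},
    {"id": 2, "upos": "ADJ", "gender": "Fem", "head": 1, "deprel": "amod"},
    {"id": 3, "upos": "NOUN", "gender": "Masc", "head": 1, "deprel": "nmod:poss"},
    {"id": 4, "upos": "PUNCT", "gender": None, "head": 1, "deprel": "punct"},
]

agree, both = agreement_stats(toy)
for key, n in both.items():
    parent_upos, deprel, child_upos = key
    print(f"{parent_upos} -[{deprel}]-> {child_upos}: {agree[key]} of {n} agree")
```

Summing the two counters over all sentences and sorting by the agreeing count would yield a ranking comparable to the list above.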