Treebank Statistics: UD_Spanish-AnCora: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
202799 tokens (36%) have a non-empty value of Gender.
17312 types (45%) occur at least once with a non-empty value of Gender.
11621 lemmas (45%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (88482; 16% instances), DET (78759; 14% instances), ADJ (24227; 4% instances), PRON (5804; 1% instances), VERB (4754; 1% instances), AUX (481; 0% instances), NUM (290; 0% instances), PROPN (2; 0% instances).
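Token-level counts of this kind can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that produced these statistics: the sample sentence is invented for illustration, the helper names (`parse_feats`, `iter_tokens`, `gender_coverage`) are hypothetical, and it assumes that multiword-token ranges and empty nodes are excluded from the token count.

```python
from collections import Counter

# Hypothetical mini-sample in CoNLL-U format (invented, not taken from the treebank).
SAMPLE = "\n".join([
    "# text = La casa blanca es grande",
    "1\tLa\tel\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_",
    "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t5\tnsubj\t_\t_",
    "3\tblanca\tblanco\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_",
    "4\tes\tser\tAUX\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t5\tcop\t_\t_",
    "5\tgrande\tgrande\tADJ\t_\tNumber=Sing\t0\troot\t_\t_",
    "",
])


def parse_feats(feats):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats) for each regular token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])


def gender_coverage(conllu_text):
    """Count all tokens, tokens with Gender, and tokens with Gender per UPOS tag."""
    total = 0
    with_gender = 0
    per_upos = Counter()
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total += 1
        if "Gender" in feats:
            with_gender += 1
            per_upos[upos] += 1
    return total, with_gender, per_upos


total, with_gender, per_upos = gender_coverage(SAMPLE)
print(f"{with_gender} of {total} tokens ({with_gender / total:.0%}) have Gender")
for upos, n in per_upos.most_common():
    print(f"{upos}: {n} ({n / total:.0%} of all tokens)")
```

The per-tag percentage is taken over all tokens, which is consistent with the figures above (for example, 88482 NOUN tokens are about 16% of the roughly 563000 tokens implied by 202799 being 36%).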
NOUN
88482 NOUN tokens (88% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (61626; 70%).
NOUN tokens may have the following values of Gender:
Fem (41292; 47% of non-empty Gender): pesetas, personas, parte, vida, situación, vez, forma, elecciones, empresa, decisión
Masc (47190; 53% of non-empty Gender): años, presidente, millones, equipo, partido, país, año, ministro, mundo, grupo
EMPTY (12053): parte, frente, portavoz, líder, respecto, vez, pese, policía, año, partir
Paradigm candidato | Masc | Fem |
---|---|---|
Number=Sing | candidato | |
Number=Plur | candidatos | CANDIDATAS |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (7756) occur with only one value of Gender.
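The lexical-feature check for NOUN can be illustrated with a short sketch. It assumes the share is computed over lemmas that have at least one non-empty Gender value; the page does not state the denominator, so this is a guess. The observations and the function name `single_value_share` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (lemma, UPOS, Gender) observations; None means Gender is empty.
OBSERVATIONS = [
    ("casa", "NOUN", "Fem"),
    ("casa", "NOUN", "Fem"),
    ("año", "NOUN", "Masc"),
    ("candidato", "NOUN", "Masc"),
    ("candidato", "NOUN", "Fem"),
    ("portavoz", "NOUN", None),
]


def single_value_share(observations, upos):
    """Return (lemmas with exactly one Gender value, lemmas with any Gender value)."""
    values_by_lemma = defaultdict(set)
    for lemma, tag, gender in observations:
        if tag == upos and gender is not None:
            values_by_lemma[lemma].add(gender)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


single, gendered = single_value_share(OBSERVATIONS, "NOUN")
print(f"{single} of {gendered} gendered lemmas occur with only one value of Gender")
```

A high share suggests that gender is stored with the lemma rather than inflected, while lemmas such as candidato, which appear with both values, are the exceptions.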
DET
78759 DET tokens (93% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (72091; 92%), Definite=Def (62512; 79%), Number=Sing (62068; 79%).
DET tokens may have the following values of Gender:
Fem (32385; 41% of non-empty Gender): la, las, una, esta, esa, todas, estas, otras, toda, otra
Masc (46374; 59% of non-empty Gender): el, los, un, este, todo, ese, todos, otros, estos, unos
EMPTY (5672): su, sus, cada, mi, cualquier, qué, tal, mis, diferentes, tu
Paradigm el | Masc | Fem |
---|---|---|
Foreign=Yes|Number=Sing | la | |
Foreign=Yes|Number=Plur | les | les |
Number=Sing | el | la |
Number=Plur | los, els | las |
ADJ
24227 ADJ tokens (67% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: VerbForm=EMPTY (17726; 73%), Number=Sing (17367; 72%).
ADJ tokens may have the following values of Gender:
Fem (10124; 42% of non-empty Gender): primera, nueva, segunda, política, española, última, nuevas, única, buena, pública
Masc (14103; 58% of non-empty Gender): pasado, primer, nuevo, próximo, últimos, español, segundo, último, único, político
EMPTY (12200): gran, mayor, mejor, general, posible, ex, grandes, actual, electoral, internacional
Paradigm primero | Masc | Fem |
---|---|---|
Number=Sing | primer, primero | primera |
Number=Plur | primeros | primeras |
PRON
5804 PRON tokens (23% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (5803; 100%), Number=Sing (4355; 75%), PronType=Prs (3553; 61%), Person=3 (3444; 59%), PrepCase=EMPTY (3168; 55%).
PRON tokens may have the following values of Gender:
Fem (1191; 21% of non-empty Gender): la, una, ella, las, ellas, otra, ésta, unas, otras, algunas
Masc (4613; 79% of non-empty Gender): lo, uno, todo, él, ellos, ello, unos, los, otros, todos
EMPTY (19380): que, se, le, me, nos, quien, les, eso, nada, qué
Paradigm él | Masc | Fem |
---|---|---|
Case=Acc,Nom|Number=Sing | él, ello | ella |
Case=Acc,Nom|Number=Plur | ellos | ellas |
Case=Acc|Definite=Def|Number=Sing|PrepCase=Npr | lo | |
Case=Acc|Definite=Ind|Number=Sing|PrepCase=Npr | LO | |
Case=Acc|Number=Sing|PrepCase=Npr | lo | la |
Case=Acc|Number=Plur|PrepCase=Npr | los | las |
Case=Nom|Number=Sing | | Ella |
VERB
4754 VERB tokens (10% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (4753; 100%), Person=EMPTY (4753; 100%), Tense=Past (4753; 100%), VerbForm=Part (4753; 100%), Number=Sing (4434; 93%).
VERB tokens may have the following values of Gender:
Fem (333; 7% of non-empty Gender): aprobada, considerada, dada, utilizada, comprada, dadas, incluida, rechazada, recibida, violada
Masc (4421; 93% of non-empty Gender): hecho, tenido, dado, visto, conseguido, pasado, ganado, llegado, perdido, logrado
EMPTY (43432): tiene, dijo, hay, hace, hacer, tienen, aseguró, dar, explicó, tener
Paradigm hacer | Masc | Fem |
---|---|---|
Number=Sing | hecho | hecha |
Number=Plur | hechos | |
AUX
481 AUX tokens (4% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (481; 100%), Number=Sing (481; 100%), Person=EMPTY (481; 100%), Tense=Past (480; 100%), VerbForm=Part (480; 100%).
AUX tokens may have the following values of Gender:
Masc (481; 100% of non-empty Gender): sido, podido, estado, debido, ser
EMPTY (13084): es, ha, han, fue, ser, son, está, puede, había, era
NUM
290 NUM tokens (3% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Card (290; 100%), NumForm=Word (289; 100%), Number=Plur (176; 61%).
NUM tokens may have the following values of Gender:
Fem (93; 32% of non-empty Gender): ambas, media, una, DECENAS, quinientas
Masc (197; 68% of non-empty Gender): ambos, medio, un, doscientos, uno, miles, quinientos, dois, euros, ochenta
EMPTY (8884): dos, ciento, tres, cinco, cuatro, seis, 20, 30, siete, 10
Paradigm ambos | Masc | Fem |
---|---|---|
| | ambos | ambas |
PROPN
2 PROPN tokens (0% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (2; 100% of non-empty Gender): Cuba, Lletres
EMPTY (42388): Gobierno, España, Madrid, Barcelona, José, Estado, PP, Juan, Nacional, Estados
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (57022; 85%),
NOUN –[amod]–> ADJ (16842; 63%),
NOUN –[conj]–> NOUN (2526; 54%),
DET –[det]–> DET (1041; 85%),
NOUN –[appos]–> NOUN (928; 51%),
ADJ –[det]–> DET (688; 63%),
ADJ –[nsubj]–> NOUN (599; 57%),
ADJ –[conj]–> ADJ (569; 55%),
PRON –[nmod]–> NOUN (439; 74%),
ADJ –[det]–> PRON (159; 62%).
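Agreement counts like those in the list above can be gathered by comparing each token's Gender with that of its head. The sketch below is only an illustration under stated assumptions: the sentence is invented, the function name `gender_agreement` is hypothetical, and a pair counts as agreeing only when both nodes have a non-empty Gender with the same value. The page does not say which denominator the percentages use, so the sketch reports raw counts only.

```python
from collections import Counter

# Hypothetical tokens: (id, UPOS, Gender or None, head id, deprel).
SENTENCE = [
    (1, "DET", "Fem", 2, "det"),
    (2, "NOUN", "Fem", 5, "nsubj"),
    (3, "ADJ", "Fem", 2, "amod"),
    (4, "AUX", None, 5, "cop"),
    (5, "ADJ", None, 0, "root"),
]


def gender_agreement(sentence):
    """Count (parent UPOS, deprel, child UPOS) triples whose nodes share a Gender value."""
    by_id = {tok[0]: tok for tok in sentence}
    agree = Counter()
    for _tok_id, upos, gender, head, deprel in sentence:
        parent = by_id.get(head)  # None for the root, whose head id is 0
        if parent is None or gender is None:
            continue
        parent_upos, parent_gender = parent[1], parent[2]
        if parent_gender == gender:
            agree[(parent_upos, deprel, upos)] += 1
    return agree


agreement = gender_agreement(SENTENCE)
for (parent_upos, deprel, child_upos), n in agreement.most_common():
    print(f"{parent_upos} –[{deprel}]–> {child_upos} ({n})")
```

In this sample the NOUN subject does not count as agreeing with its ADJ head because the head has no Gender value, which shows why adjectives with empty Gender, such as grande, lower the agreement rate for relations like nsubj.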