Treebank Statistics: UD_Portuguese-Bosque: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
109130 tokens (48%) have a non-empty value of Gender.
18851 types (73%) occur at least once with a non-empty value of Gender.
14461 lemmas (80%) occur at least once with a non-empty value of Gender.
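The token, type, and lemma counts above could be reproduced along the following lines. This is a minimal sketch, not the script that generated this page: the `SAMPLE` fragment is invented for illustration, the function names are hypothetical, and treating a "type" as a case-sensitive word form is an assumption.

```python
# Toy CoNLL-U fragment, invented for illustration (not taken from the treebank).
SAMPLE = (
    "1\tA\to\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_\n"
    "2\tcasa\tcasa\tNOUN\t_\tGender=Fem|Number=Sing\t4\tnsubj\t_\t_\n"
    "3\té\tser\tAUX\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t4\tcop\t_\t_\n"
    "4\tnova\tnovo\tADJ\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_\n"
)


def parse_feats(feats):
    """Turn 'A=1|B=2' into {'A': '1', 'B': '2'}; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_words(conllu_text):
    """Yield (form, lemma, feats) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        yield cols[1], cols[2], parse_feats(cols[5])


def gender_coverage(conllu_text, feature="Gender"):
    """Return {unit: (with_feature, total)} for tokens, types, and lemmas."""
    tokens_with = tokens_all = 0
    types_with, types_all = set(), set()
    lemmas_with, lemmas_all = set(), set()
    for form, lemma, feats in read_words(conllu_text):
        tokens_all += 1
        types_all.add(form)    # assumption: types are case-sensitive forms
        lemmas_all.add(lemma)
        if feature in feats:
            tokens_with += 1
            types_with.add(form)
            lemmas_with.add(lemma)
    return {
        "tokens": (tokens_with, tokens_all),
        "types": (len(types_with), len(types_all)),
        "lemmas": (len(lemmas_with), len(lemmas_all)),
    }


coverage = gender_coverage(SAMPLE)
for unit, (with_feature, total) in coverage.items():
    print(f"{unit}: {with_feature} of {total} ({100 * with_feature / total:.0f}%)")
```

A type or lemma counts as soon as one of its occurrences carries the feature, which is why the type and lemma percentages can be higher than the token percentage.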
The feature is used with 13 part-of-speech tags: NOUN (41220; 18% instances), DET (34574; 15% instances), PROPN (11532; 5% instances), ADJ (11338; 5% instances), PRON (6713; 3% instances), VERB (3537; 2% instances), NUM (166; 0% instances), X (17; 0% instances), ADV (14; 0% instances), AUX (9; 0% instances), SCONJ (6; 0% instances), ADP (3; 0% instances), PART (1; 0% instances).
NOUN
41220 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (29542; 72%).
NOUN tokens may have the following values of Gender:
Fem (18805; 46% of non-empty Gender): pessoas, parte, semana, vez, empresa, forma, empresas, cidade, casa, vida
Masc (22415; 54% of non-empty Gender): anos, presidente, ano, dia, país, estado, tempo, contos, grupo, governo
EMPTY (166): partir, especialistas, representantes, jornalistas, jovens, estudantes, habitantes, par, visitantes, Esposende
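A per-tag value breakdown like the one above could be computed roughly as follows. This is a sketch under stated assumptions: the `WORDS` list is invented toy data, `value_distribution` is a hypothetical helper, and percentages are taken over tokens with a non-empty Gender only, as the "% of non-empty Gender" wording suggests.

```python
from collections import Counter, defaultdict

# Hypothetical pre-parsed words: (form, UPOS, Gender value or None).
WORDS = [
    ("casa", "NOUN", "Fem"),
    ("casa", "NOUN", "Fem"),
    ("vida", "NOUN", "Fem"),
    ("dia", "NOUN", "Masc"),
    ("jovens", "NOUN", None),
    ("nova", "ADJ", "Fem"),
]


def value_distribution(words, upos, top_n=10):
    """Map each value (or 'EMPTY') to (count, percent of non-empty, example forms)."""
    counts = Counter()
    forms = defaultdict(Counter)
    for form, tag, value in words:
        if tag != upos:
            continue
        key = value if value is not None else "EMPTY"
        counts[key] += 1
        forms[key][form] += 1
    non_empty = sum(n for v, n in counts.items() if v != "EMPTY")
    report = {}
    for value, n in counts.items():
        # EMPTY gets no percentage, mirroring the listing above.
        share = None if value == "EMPTY" else round(100 * n / non_empty)
        examples = [f for f, _ in forms[value].most_common(top_n)]
        report[value] = (n, share, examples)
    return report


report = value_distribution(WORDS, "NOUN")
for value, (n, share, examples) in report.items():
    label = f"{n}" if share is None else f"{n}; {share}% of non-empty Gender"
    print(f"{value} ({label}): {', '.join(examples)}")
```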
Paradigm dia | Masc | Fem |
---|---|---|
Number=Sing | dia | dia |
Number=Plur | dias | |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (6592) occur with only one value of Gender.
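The "lexical feature" observation rests on how many lemmas occur with a single Gender value. A minimal sketch of that check follows; the `ROWS` data is invented, `single_value_share` is a hypothetical name, and counting only lemmas that have at least one non-empty Gender value in the denominator is an assumption.

```python
from collections import defaultdict

# Hypothetical pre-parsed words: (lemma, UPOS, feature dict).
ROWS = [
    ("casa", "NOUN", {"Gender": "Fem"}),
    ("casa", "NOUN", {"Gender": "Fem"}),
    ("dia", "NOUN", {"Gender": "Masc"}),
    ("dia", "NOUN", {"Gender": "Fem"}),
    ("novo", "ADJ", {"Gender": "Masc"}),
    ("estudante", "NOUN", {}),
]


def single_value_share(rows, upos, feature="Gender"):
    """Return (lemmas with exactly one value, lemmas with any value)."""
    values = defaultdict(set)
    for lemma, tag, feats in rows:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total = single_value_share(ROWS, "NOUN")
print(f"{100 * single / total:.0f}% of lemmas ({single}) occur with only one value")
```

In the toy data, `casa` has one value and `dia` has two, so the share is 1 of 2; `estudante` is left out because it never carries the feature.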
DET
34574 DET tokens (99% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (30779; 89%), Definite=Def (27460; 79%), Number=Sing (27254; 79%).
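Co-occurrence figures of this kind could be gathered as sketched below. The `TOKENS` data is invented, and both the helper name and the `min_share` cut-off are assumptions, since the page does not say how "most frequent" is decided. Unlike the listings on this page, the sketch does not count absent features as `EMPTY`.

```python
from collections import Counter

# Hypothetical pre-parsed words: (UPOS, feature dict).
TOKENS = [
    ("DET", {"Gender": "Fem", "Number": "Sing", "PronType": "Art", "Definite": "Def"}),
    ("DET", {"Gender": "Masc", "Number": "Plur", "PronType": "Art", "Definite": "Def"}),
    ("DET", {"Gender": "Masc", "Number": "Sing", "PronType": "Dem"}),
    ("DET", {"Number": "Sing", "PronType": "Ind"}),
    ("NOUN", {"Gender": "Fem", "Number": "Sing"}),
]


def cooccurring_values(tokens, upos, feature="Gender", min_share=0.5):
    """List (Feature=Value, count, percent) seen alongside `feature` on `upos` words."""
    total = 0
    pairs = Counter()
    for tag, feats in tokens:
        if tag != upos or feature not in feats:
            continue
        total += 1
        for name, value in feats.items():
            if name != feature:
                pairs[f"{name}={value}"] += 1
    return [
        (pair, n, round(100 * n / total))
        for pair, n in pairs.most_common()
        if n / total > min_share  # assumed threshold, not from the source
    ]


frequent = cooccurring_values(TOKENS, "DET")
print(", ".join(f"{pair} ({n}; {pct}%)" for pair, n, pct in frequent))
```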
DET tokens may have the following values of Gender:
Fem (15753; 46% of non-empty Gender): a, as, uma, sua, esta, suas, essa, toda, outras, algumas
Masc (18821; 54% of non-empty Gender): o, os, um, seu, este, seus, esse, todos, outros, outro
EMPTY (291): a, as, o, mais, qual, qualquer, tal, cada, que, um
Paradigm o | Masc | Fem |
---|---|---|
Definite=Def|ExtPos=PROPN|Number=Plur|PronType=Art | As | |
Definite=Def|Number=Sing|PronType=Art | o, Os, a, o(s) | a |
Definite=Def|Number=Sing|PronType=Art|Typo=Yes | os | o |
Definite=Def|Number=Plur|PronType=Art | os, o | as |
Definite=Def|Number=Plur|PronType=Art|Typo=Yes | o | a, As |
Definite=Ind|Number=Sing|PronType=Art | o | |
ExtPos=PROPN|Number=Sing|PronType=Art | O | |
Number=Sing|PronType=Art | o, A | a |
Number=Sing|PronType=Dem | o | a |
Number=Plur|PronType=Art | os | as |
Number=Plur|PronType=Dem | os | as |
PROPN
11532 PROPN tokens (61% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (11116; 96%), ExtPos=EMPTY (7643; 66%).
PROPN tokens may have the following values of Gender:
Fem (3757; 33% of non-empty Gender): Lisboa, Folha, Câmara, Alemanha, França, Comissão, Espanha, Europa, Rússia, Itália
Masc (7775; 67% of non-empty Gender): São, Portugal, Brasil, José, Governo, EUA, Rio, Estados, João, PÚBLICO
EMPTY (7225): Paulo, Nacional, Unidos, Silva, Porto, Henrique, Lisboa, Sul, Costa, República
Paradigm São | Masc | Fem |
---|---|---|
Abbr=Yes|ExtPos=PROPN|Number=Sing | S. | |
Abbr=Yes|Number=Sing | S. | |
ExtPos=PROPN | SÃO | |
ExtPos=PROPN|Number=Sing | São, SÃO | São |
Number=Sing | São | |
Gender seems to be a lexical feature of PROPN. 95% of lemmas (4385) occur with only one value of Gender.
ADJ
11338 ADJ tokens (99% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (8184; 72%).
ADJ tokens may have the following values of Gender:
Fem (5213; 46% of non-empty Gender): primeira, nova, maior, grande, última, mesma, boa, segunda, política, passada
Masc (6125; 54% of non-empty Gender): primeiro, novo, mesmo, passado, último, segundo, últimos, bom, maior, grande
EMPTY (60): melhor, capaz, Nacional, contente, especial, favorável, inconvenientes, jovens, mole, Aérea
Paradigm novo | Masc | Fem |
---|---|---|
Number=Sing | novo | nova |
Number=Plur | novos | novas |
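A paradigm table of this shape can be assembled by grouping the forms of one lemma by their remaining features and then by Gender. The sketch below is illustrative only: the `OCCURRENCES` list is a hand-made stand-in for the occurrences of `novo`, the function names are hypothetical, and rows are sorted alphabetically, which need not match the row order used on this page.

```python
from collections import defaultdict

# Hypothetical occurrences of a single lemma: (form, feature dict).
OCCURRENCES = [
    ("novo", {"Gender": "Masc", "Number": "Sing"}),
    ("nova", {"Gender": "Fem", "Number": "Sing"}),
    ("novos", {"Gender": "Masc", "Number": "Plur"}),
    ("novas", {"Gender": "Fem", "Number": "Plur"}),
    ("nova", {"Gender": "Fem", "Number": "Sing"}),
]


def paradigm(occurrences, feature="Gender"):
    """Group forms as table[other features][feature value] -> list of forms."""
    table = defaultdict(lambda: defaultdict(list))
    for form, feats in occurrences:
        if feature not in feats:
            continue
        rest = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != feature)
        cell = table[rest][feats[feature]]
        if form not in cell:  # list each form once per cell
            cell.append(form)
    return table


def render(table, lemma, values=("Masc", "Fem")):
    """Render the grouped forms as a pipe table with one column per value."""
    lines = [
        f"Paradigm {lemma} | " + " | ".join(values) + " |",
        "---|" * (len(values) + 1),
    ]
    for rest in sorted(table):
        cells = [", ".join(table[rest].get(v, [])) for v in values]
        lines.append(f"{rest} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


table = paradigm(OCCURRENCES)
rendered = render(table, "novo")
print(rendered)
```

A cell stays empty when no form of the lemma was seen with that combination of Gender and other features.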
PRON
6713 PRON tokens (90% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (4951; 74%), Case=EMPTY (4722; 70%), Person=EMPTY (4570; 68%).
PRON tokens may have the following values of Gender:
Fem (1961; 29% of non-empty Gender): que, se, a, ela, onde, as, elas, esta, lhe, eu
Masc (4752; 71% of non-empty Gender): que, se, o, ele, isso, tudo, eles, os, lhe, onde
EMPTY (753): se, quem, me, nos, que, eu, você, nós, si, onde
Paradigm que | Masc | Fem |
---|---|---|
Case=Acc|Number=Sing|Person=3|PronType=Int | Que | |
Definite=Def|Number=Sing|PronType=Art | que | |
Number=Sing|PronType=Dem | que | |
Number=Sing|PronType=Ind | que | que |
Number=Sing|PronType=Int | que | que |
Number=Sing|PronType=Rel | que | que |
Number=Sing|PronType=Rel|Typo=Yes | qu | |
Number=Plur|PronType=Ind | que | |
Number=Plur|PronType=Int | que | que |
Number=Plur|PronType=Rel | que | que |
VERB
3537 VERB tokens (17% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (3536; 100%), Tense=EMPTY (3536; 100%), Mood=EMPTY (3535; 100%), VerbForm=Part (3534; 100%), Number=Sing (2329; 66%).
VERB tokens may have the following values of Gender:
Fem (1435; 41% of non-empty Gender): feita, feitas, considerada, criada, realizada, apresentada, dada, utilizada, marcada, aprovada
Masc (2102; 59% of non-empty Gender): feito, eleito, aberto, considerado, ligados, realizado, acusado, divulgado, entregue, feitos
EMPTY (17229): tem, há, disse, pode, fazer, diz, ter, é, deve, está
Paradigm ter | Masc | Fem |
---|---|---|
Number=Sing | tido | |
Number=Sing|Voice=Pass | tido | tida |
Number=Plur | tidas | |
NUM
166 NUM tokens (4% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=Mult (131; 79%).
NUM tokens may have the following values of Gender:
Fem (5; 3% of non-empty Gender): dezenas, 13, 16, 4ª
Masc (161; 97% of non-empty Gender): cento, milhões, meia, dúzia, milhares, 1, 1., 14,667, 185/60, Mil
EMPTY (4494): um, dois, três, mil, milhões, uma, duas, quatro, cinco, 15
Gender seems to be a lexical feature of NUM. 100% of lemmas (21) occur with only one value of Gender.
X
17 X tokens (10% of all X tokens) have a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Number=Sing (16; 94%).
X tokens may have the following values of Gender:
Fem (5; 29% of non-empty Gender): made, Body, morcilla, natura
Masc (12; 71% of non-empty Gender): Dream, Insight, MacMillan, consejero, dolce, godfather, kebab, killer, line, primitive
EMPTY (146): in, pole, position, jet, art, body, center, computing, drag, dream
Gender seems to be a lexical feature of X. 100% of lemmas (16) occur with only one value of Gender.
ADV
14 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: Polarity=EMPTY (13; 93%).
ADV tokens may have the following values of Gender:
Fem (2; 14% of non-empty Gender): quanto, tal
Masc (12; 86% of non-empty Gender): quanto, entanto, inteligente-, menos, não, ontem, teatral, um
EMPTY (8371): não, mais, já, também, ainda, ontem, só, depois, muito, como
Paradigm quanto | Masc | Fem |
---|---|---|
PronType=Ind | quanto | |
PronType=Int | quanto | |
PronType=Rel | quanto | |
AUX
9 AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (9; 100%), Number=Sing (9; 100%), Person=EMPTY (9; 100%), Tense=EMPTY (9; 100%), VerbForm=Part (9; 100%).
AUX tokens may have the following values of Gender:
Masc (9; 100% of non-empty Gender): sido
EMPTY (5018): é, foi, ser, são, está, foram, vai, era, ter, será
SCONJ
6 SCONJ tokens (0% of all SCONJ tokens) have a non-empty value of Gender.
SCONJ tokens may have the following values of Gender:
Fem (3; 50% of non-empty Gender): Uma, que, uns
Masc (3; 50% of non-empty Gender): que
EMPTY (5352): que, a, de, para, se, porque, como, por, em, quando
Paradigm que | Masc | Fem |
---|---|---|
que | ||
PronType=Rel | que | que |
ADP
3 ADP tokens (0% of all ADP tokens) have a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender): de, que
EMPTY (33781): de, em, a, por, com, para, como, entre, sobre, até
PART
1 PART token (33% of all PART tokens) has a non-empty value of Gender.
The most frequent other feature values with which PART and Gender co-occurred: ExtPos=EMPTY (1; 100%), Number=Sing (1; 100%).
PART tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): pós
EMPTY (2): anti, pré
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (28280; 100%),
NOUN –[amod]–> ADJ (8998; 100%),
PROPN –[det]–> DET (4454; 81%),
NOUN –[acl]–> VERB (1597; 67%),
NOUN –[conj]–> NOUN (1383; 60%),
NOUN –[appos]–> PROPN (1216; 90%),
PROPN –[conj]–> PROPN (811; 75%),
VERB –[nsubj:pass]–> NOUN (572; 79%),
ADJ –[nsubj]–> NOUN (435; 97%),
ADJ –[conj]–> ADJ (385; 98%).
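Agreement counts like those listed above could be collected from a dependency tree as sketched here. The sentence is a made-up example, `agreement_stats` is a hypothetical name, and reading each percentage as "agreeing pairs out of pairs where both nodes have a Gender value" is an assumption about how the page computes it.

```python
from collections import Counter

# Invented toy sentence ("a casa nova e dia"), one dict per word.
SENTENCE = [
    {"id": 1, "head": 2, "upos": "DET", "deprel": "det", "feats": {"Gender": "Fem"}},
    {"id": 2, "head": 0, "upos": "NOUN", "deprel": "root", "feats": {"Gender": "Fem"}},
    {"id": 3, "head": 2, "upos": "ADJ", "deprel": "amod", "feats": {"Gender": "Fem"}},
    {"id": 4, "head": 5, "upos": "CCONJ", "deprel": "cc", "feats": {}},
    {"id": 5, "head": 2, "upos": "NOUN", "deprel": "conj", "feats": {"Gender": "Masc"}},
]


def agreement_stats(sentence, feature="Gender"):
    """Count, per (parent UPOS, relation, child UPOS), agreeing and comparable pairs."""
    by_id = {word["id"]: word for word in sentence}
    agree, both = Counter(), Counter()
    for child in sentence:
        parent = by_id.get(child["head"])
        if parent is None:
            continue  # the root has no parent word
        child_value = child["feats"].get(feature)
        parent_value = parent["feats"].get(feature)
        if child_value is None or parent_value is None:
            continue  # only pairs where both nodes carry the feature are comparable
        key = (parent["upos"], child["deprel"], child["upos"])
        both[key] += 1
        if child_value == parent_value:
            agree[key] += 1
    return agree, both


agree, both = agreement_stats(SENTENCE)
for (parent_upos, deprel, child_upos), n in both.items():
    hits = agree[(parent_upos, deprel, child_upos)]
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({hits}; {100 * hits / n:.0f}%)")
```

For a whole treebank, the two counters would be summed over all sentences before ranking relations by their agreement count.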