Treebank Statistics: UD_Portuguese-PetroGold: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
131808 tokens (53%) have a non-empty value of Gender.
11831 types (78%) occur at least once with a non-empty value of Gender.
8284 lemmas (79%) occur at least once with a non-empty value of Gender.
The feature is used with 10 part-of-speech tags: NOUN (57532; 23% instances), DET (36346; 15% instances), ADJ (17069; 7% instances), VERB (8783; 4% instances), PROPN (8299; 3% instances), PRON (3507; 1% instances), ADV (216; 0% instances), NUM (51; 0% instances), AUX (4; 0% instances), X (1; 0% instances).
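Counts like the ones above can be reproduced from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the helper names (`parse_feats`, `iter_tokens`, `gender_coverage`) are hypothetical, and the four-token sample is invented for illustration rather than taken from the treebank. It assumes that multiword-token ranges and empty nodes are excluded from the token count.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Gender=Fem|Number=Sing' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def iter_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for each syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges ('1-2') and empty nodes ('1.1').
        if "-" in cols[0] or "." in cols[0]:
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def gender_coverage(conllu_text, feature="Gender"):
    """Return (all tokens, tokens with the feature, per-UPOS counts of the latter)."""
    total = 0
    with_value = 0
    per_upos = Counter()
    for _form, _lemma, upos, feats in iter_tokens(conllu_text):
        total += 1
        if feature in feats:
            with_value += 1
            per_upos[upos] += 1
    return total, with_value, per_upos

# Invented sample sentence ("a água é utilizada"), not treebank data.
SAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "a", "o", "DET", "_", "Definite=Def|Gender=Fem|Number=Sing|PronType=Art", "2", "det", "_", "_"),
    ("2", "água", "água", "NOUN", "_", "Gender=Fem|Number=Sing", "4", "nsubj:pass", "_", "_"),
    ("3", "é", "ser", "AUX", "_", "Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin", "4", "aux:pass", "_", "_"),
    ("4", "utilizada", "utilizar", "VERB", "_", "Gender=Fem|Number=Sing|VerbForm=Part", "0", "root", "_", "_"),
])

total, with_value, per_upos = gender_coverage(SAMPLE)
print(total, with_value, dict(per_upos))
```

The per-tag percentages in the list above appear to be shares of all tokens in the treebank (for example, 57532 NOUN tokens against roughly 249 thousand tokens overall gives 23%), so dividing each `per_upos` count by `total` would be the matching computation.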
NOUN
57532 NOUN tokens (100% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (41495; 72%).
NOUN tokens may have the following values of Gender:
Fem (28734; 50% of non-empty Gender): água, figura, produção, área, argila, perfuração, forma, pressão, formação, tabela
Masc (28798; 50% of non-empty Gender): óleo, fluido, petróleo, gás, fluidos, processo, dados, campo, sistema, tempo
EMPTY (30): place, Figura, Offshore, ,, Argila, Captura, Equação, Etanol, Petróleo, Processo
Paradigm óleo | Masc | Fem |
---|---|---|
Number=Sing | óleo | óleo |
Number=Plur | óleos | |
Gender seems to be a lexical feature of NOUN. 97% of lemmas (3589) occur with only one value of Gender.
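The "lexical feature" observation rests on how many lemmas of a tag occur with a single Gender value. A minimal sketch of that check follows. The function name `single_value_lemma_share` is hypothetical, the token list is invented, and it assumes the denominator is the set of lemmas of that tag seen with at least one non-empty Gender value, which may differ from how this page computes its percentage.

```python
from collections import defaultdict

def single_value_lemma_share(tokens, upos="NOUN", feature="Gender"):
    """tokens: iterable of (lemma, upos, feats_dict).

    Returns (number of lemmas with exactly one value, share of such lemmas).
    """
    values = defaultdict(set)
    for lemma, tag, feats in tokens:
        if tag == upos and feature in feats:
            values[lemma].add(feats[feature])
    if not values:
        return 0, 0.0
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, single / len(values)

# Invented tokens for illustration; 'óleo' is given both values on purpose.
TOKENS = [
    ("óleo", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
    ("óleo", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("água", "NOUN", {"Gender": "Fem", "Number": "Sing"}),
    ("água", "NOUN", {"Gender": "Fem", "Number": "Plur"}),
    ("campo", "NOUN", {"Gender": "Masc", "Number": "Sing"}),
]

count, share = single_value_lemma_share(TOKENS)
print(count, round(share, 2))
```

A lemma that appears with both values, as óleo does in the paradigm above, counts against the share and is often worth inspecting as a possible annotation inconsistency.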
DET
36346 DET tokens (100% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (31768; 87%), Definite=Def (29026; 80%), Number=Sing (28129; 77%).
DET tokens may have the following values of Gender:
Fem (18204; 50% of non-empty Gender): a, as, uma, esta, sua, estas, essa, suas, cada, essas
Masc (18142; 50% of non-empty Gender): o, os, um, este, estes, esse, seu, esses, todos, cada
Paradigm o | Masc | Fem |
---|---|---|
Definite=Def|Number=Sing|PronType=Art | o | a, á |
Definite=Def|Number=Plur|PronType=Art | os | as, A |
Number=Sing | o | |
Number=Plur|PronType=Art | os | |
ADJ
17069 ADJ tokens (100% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (11092; 65%).
ADJ tokens may have the following values of Gender:
Fem (8545; 50% of non-empty Gender): maior, grande, magnética, alta, baixa, menor, mesma, magnéticas, aquosa, continental
Masc (8524; 50% of non-empty Gender): magnético, maior, possível, necessário, magnéticos, natural, presente, diferentes, mesmo, total
EMPTY (11): subsea, primeira, próximo
Paradigm maior | Masc | Fem |
---|---|---|
Number=Sing | maior | maior |
Number=Plur | maiores | maiores |
VERB
8783 VERB tokens (43% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (8782; 100%), Person=EMPTY (8782; 100%), Tense=EMPTY (8782; 100%), VerbForm=Part (8775; 100%), Number=Sing (5169; 59%), Voice=EMPTY (4792; 55%).
VERB tokens may have the following values of Gender:
Fem (3791; 43% of non-empty Gender): utilizada, produzida, utilizadas, realizada, feita, obtidas, obtida, observada, associadas, observadas
Masc (4992; 57% of non-empty Gender): devido, utilizado, utilizados, obtidos, apresentados, observado, realizados, obtido, associados, realizado
EMPTY (11576): pode, podem, partir, apresenta, utilizando, tem, apresentam, deve, mostra, ocorre
Paradigm utilizar | Masc | Fem |
---|---|---|
Number=Sing|VerbForm=Ger | utilizado | |
Number=Sing|VerbForm=Part | utilizado | utilizada, utilizado |
Number=Sing|VerbForm=Part|Voice=Pass | utilizado | utilizada |
Number=Plur|VerbForm=Part | utilizados | utilizadas |
Number=Plur|VerbForm=Part|Voice=Pass | utilizados | utilizadas |
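The paradigm tables on this page appear to group the forms of one lemma by Gender value (columns) and by the remaining features (rows). A minimal sketch of that grouping follows. The function name `paradigm` is hypothetical, the tokens are invented, and the row labels are assumed to be the other features joined in alphabetical order.

```python
from collections import defaultdict

def paradigm(tokens, lemma, feature="Gender"):
    """tokens: iterable of (form, lemma, feats_dict).

    Returns {row_label: {feature_value: sorted list of forms}} for one lemma,
    where row_label is the remaining features joined with '|'.
    """
    table = defaultdict(lambda: defaultdict(set))
    for form, lem, feats in tokens:
        if lem != lemma or feature not in feats:
            continue  # only tokens of this lemma that carry the feature
        row = "|".join(f"{k}={v}" for k, v in sorted(feats.items()) if k != feature)
        table[row][feats[feature]].add(form)
    return {row: {val: sorted(forms) for val, forms in cols.items()}
            for row, cols in table.items()}

# Invented tokens for illustration, not treebank data.
TOKENS = [
    ("utilizado", "utilizar", {"Gender": "Masc", "Number": "Sing", "VerbForm": "Part"}),
    ("utilizada", "utilizar", {"Gender": "Fem", "Number": "Sing", "VerbForm": "Part"}),
    ("utilizadas", "utilizar", {"Gender": "Fem", "Number": "Plur", "VerbForm": "Part", "Voice": "Pass"}),
    ("utilizando", "utilizar", {"VerbForm": "Ger"}),  # no Gender, so it is skipped
]

for row, cols in paradigm(TOKENS, "utilizar").items():
    print(row, "|", ", ".join(cols.get("Masc", [])), "|", ", ".join(cols.get("Fem", [])))
```

Rows with an unexpected combination, such as a gerund row holding a participle form, are the kind of cell this view makes easy to spot.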
PROPN
8299 PROPN tokens (69% of all PROPN tokens) have a non-empty value of Gender.
The most frequent other feature values with which PROPN and Gender co-occurred: Number=Sing (8080; 97%).
PROPN tokens may have the following values of Gender:
Fem (2688; 32% of non-empty Gender): Bacia, Formação, NE-SW, MEG, ilha, Petrobras, ANP, NW-SE, Fm, Goma
Masc (5611; 68% of non-empty Gender): CO2, C, Membro, Brasil, Rio, Grupo, Campos, PHPA, GX, MDL
EMPTY (3714): et, al., Cabo, Frio, &, Santos, Grande, Romualdo, Campos, São
Paradigm NE-SW | Masc | Fem |
---|---|---|
Number=Sing | NE-SW | NE-SW |
Number=Plur | NE-SW | |
PRON
3507 PRON tokens (65% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (2395; 68%), PronType=Rel (1986; 57%).
PRON tokens may have the following values of Gender:
Fem (1219; 35% of non-empty Gender): que, a, uma, esta, elas, ela, qual, as, estas, mesma
Masc (2288; 65% of non-empty Gender): que, o, isso, isto, este, um, qual, eles, mesmo, estes
EMPTY (1892): se, nos, que, nós, um
Paradigm que | Masc | Fem |
---|---|---|
Number=Sing | que | que |
Number=Plur | que | que |
ADV
216 ADV tokens (3% of all ADV tokens) have a non-empty value of Gender.
ADV tokens may have the following values of Gender:
Fem (86; 40% of non-empty Gender): onde, SIM, melhor
Masc (130; 60% of non-empty Gender): onde
EMPTY (6226): mais, não, também, através, já, muito, assim, bem, ainda, além
Paradigm onde | Masc | Fem |
---|---|---|
Number=Sing | Onde | |
Number=Sing|PronType=Rel | onde | onde |
Number=Plur|PronType=Rel | onde | onde |
NUM
51 NUM tokens (1% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumType=EMPTY (31; 61%).
NUM tokens may have the following values of Gender:
Fem (4; 8% of non-empty Gender): II.7, II.7.2, II.8.1, nove
Masc (47; 92% of non-empty Gender): III.2, 36º, 43º, 44,6º, 80º, 8º, II.1, II.2.3, II.3, II.4.1
EMPTY (7239): 1, dois, 3, 2, 5, 10, duas, 4, três, 2005
Gender seems to be a lexical feature of NUM. 100% of lemmas (50) occur with only one value of Gender.
AUX
4 AUX tokens (0% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (4; 100%), Number=Sing (4; 100%), Person=EMPTY (4; 100%), Tense=EMPTY (4; 100%), VerbForm=Part (4; 100%).
AUX tokens may have the following values of Gender:
Masc (4; 100% of non-empty Gender): sido
EMPTY (6570): é, são, foi, ser, foram, sendo, estão, está, será, serão
X
1 X token (0% of all X tokens) has a non-empty value of Gender.
The most frequent other feature values with which X and Gender co-occurred: Foreign=EMPTY (1; 100%).
X tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): drill-in
EMPTY (215): in, drill, n, flow, core, ., booster, pin, situ, stripe
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (32967; 100%),
NOUN –[amod]–> ADJ (14549; 100%),
NOUN –[acl]–> VERB (4069; 93%),
NOUN –[conj]–> NOUN (2656; 61%),
VERB –[nsubj:pass]–> NOUN (2123; 77%),
PROPN –[det]–> DET (2112; 99%),
NOUN –[nmod]–> PROPN (1915; 61%),
ADJ –[obl]–> NOUN (713; 54%),
ADJ –[nsubj]–> NOUN (663; 91%),
PROPN –[conj]–> PROPN (663; 71%).
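Agreement figures of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is one way to do that, not the page's own method. The function name `agreement_stats` and the token-dict layout are hypothetical, the two sentences are invented, and the percentage is assumed to be taken over edges where both nodes have a non-empty Gender.

```python
from collections import Counter

def agreement_stats(sentences, feature="Gender"):
    """sentences: list of sentences, each a list of token dicts with the keys
    'id', 'head', 'upos', 'deprel', 'feats'.

    Returns {(parent_upos, deprel, child_upos): (agreeing edges, share)}.
    """
    agree = Counter()
    both = Counter()
    for sent in sentences:
        by_id = {tok["id"]: tok for tok in sent}
        for child in sent:
            parent = by_id.get(child["head"])
            if parent is None:  # the root has head 0, which is not a token
                continue
            parent_value = parent["feats"].get(feature)
            child_value = child["feats"].get(feature)
            if parent_value is None or child_value is None:
                continue  # assumption: only edges with both values are counted
            key = (parent["upos"], child["deprel"], child["upos"])
            both[key] += 1
            if parent_value == child_value:
                agree[key] += 1
    return {key: (agree[key], agree[key] / both[key]) for key in both}

def tok(i, head, upos, deprel, **feats):
    """Small helper to build a token dict for the invented examples."""
    return {"id": i, "head": head, "upos": upos, "deprel": deprel, "feats": feats}

# Invented sentences: "a água é utilizada" and "óleo e água".
SENTENCES = [
    [tok(1, 2, "DET", "det", Gender="Fem"),
     tok(2, 4, "NOUN", "nsubj:pass", Gender="Fem"),
     tok(3, 4, "AUX", "aux:pass"),
     tok(4, 0, "VERB", "root", Gender="Fem")],
    [tok(1, 0, "NOUN", "root", Gender="Masc"),
     tok(2, 3, "CCONJ", "cc"),
     tok(3, 1, "NOUN", "conj", Gender="Fem")],
]

for key, (count, share) in agreement_stats(SENTENCES).items():
    print(key, count, f"{share:.0%}")
```

Relations such as det and amod are expected to agree almost always, while conj and nmod link independent nominals, which is consistent with their lower percentages in the list above.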