Treebank Statistics: UD_Italian-MarkIT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
17767 tokens (44%) have a non-empty value of Gender.
3855 types (64%) occur at least once with a non-empty value of Gender.
2915 lemmas (71%) occur at least once with a non-empty value of Gender.
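Counts of this kind can be reproduced from CoNLL-U files. The following is a minimal sketch, not the script that generated this page. It assumes the standard 10-column CoNLL-U layout, uses a tiny hand-made sample instead of the real treebank, and treats a "type" as a distinct word form (an assumption). The helper names `parse_conllu` and `gender_coverage` are hypothetical.

```python
# Tiny hand-made sample in CoNLL-U format (not taken from the treebank).
SAMPLE = (
    "# text = la vita è bella\n"
    "1\tla\til\tDET\t_\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_\n"
    "2\tvita\tvita\tNOUN\t_\tGender=Fem|Number=Sing\t4\tnsubj\t_\t_\n"
    "3\tè\tessere\tAUX\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t4\tcop\t_\t_\n"
    "4\tbella\tbello\tADJ\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_\n"
)


def parse_conllu(text):
    """Yield (form, lemma, upos, feats) for each regular token line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank lines separate sentences; '#' lines are comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        feats = {}
        if cols[5] != "_":
            feats = dict(item.split("=", 1) for item in cols[5].split("|"))
        yield cols[1], cols[2], cols[3], feats


def gender_coverage(tokens):
    """Return (with_gender, total) pairs for tokens, types and lemmas."""
    tokens = list(tokens)
    with_gender = [t for t in tokens if "Gender" in t[3]]
    return {
        "tokens": (len(with_gender), len(tokens)),
        "types": (len({t[0] for t in with_gender}), len({t[0] for t in tokens})),
        "lemmas": (len({t[1] for t in with_gender}), len({t[1] for t in tokens})),
    }


stats = gender_coverage(parse_conllu(SAMPLE))
for name, (n, total) in stats.items():
    print(f"{name}: {n} of {total} ({100 * n / total:.0f}%)")
```

To try this on real data, the contents of a `.conllu` file could be read into a string and passed to `parse_conllu` in place of `SAMPLE`.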
The feature is used with 11 part-of-speech tags: NOUN (7399; 18% instances), DET (5737; 14% instances), ADJ (2454; 6% instances), PRON (1391; 3% instances), VERB (718; 2% instances), AUX (61; 0% instances), PROPN (3; 0% instances), ADP (1; 0% instances), ADV (1; 0% instances), CCONJ (1; 0% instances), SCONJ (1; 0% instances).
NOUN
7399 NOUN tokens (99% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (5598; 76%).
NOUN tokens may have the following values of Gender:
Fem (3509; 47% of non-empty Gender): vita, società, persone, scienza, felicità, parte, amicizia, ricerca, storia, filosofia
Masc (3890; 53% of non-empty Gender): uomo, tempo, esempio, anni, modo, mondo, amico, paese, stato, motivo
EMPTY (54): grazie, riconoscere, vivere, Domani, Estero, Museo, Stato, Uomo, aldilá, avanzare
Paradigm grazie | Masc | Fem |
---|---|---|
grazie | grazie |
Gender seems to be a lexical feature of NOUN. 98% of lemmas (1728) occur with only one value of Gender.
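The "lexical feature" judgment rests on how many lemmas are seen with exactly one value of Gender. A minimal sketch of that check follows, under stated assumptions: the `(lemma, gender)` pairs are invented for illustration rather than drawn from the treebank, and `single_value_share` is a hypothetical helper, not part of any UD tool.

```python
from collections import defaultdict

# Invented (lemma, gender) observations for NOUN tokens; not treebank data.
observations = [
    ("vita", "Fem"), ("vita", "Fem"),
    ("uomo", "Masc"),
    ("amico", "Masc"), ("amico", "Fem"),
]


def single_value_share(pairs):
    """Return (lemmas seen with exactly one Gender value, all lemmas seen)."""
    values = defaultdict(set)
    for lemma, gender in pairs:
        values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)


single, total = single_value_share(observations)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A share close to 100% suggests that the value belongs to the lemma itself, whereas a low share would point to an inflectional feature.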
DET
5737 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (4632; 81%), Number=Sing (4398; 77%), Definite=Def (3780; 66%).
DET tokens may have the following values of Gender:
Fem (2515; 44% of non-empty Gender): la, le, una, questa, sua, l’, nostra, queste, sue, propria
Masc (3222; 56% of non-empty Gender): il, un, i, gli, questo, lo, suo, questi, ogni, l’
EMPTY (825): l’, un’, tale, che, più, tali, un, cui, dei, delle
Paradigm il | Masc | Fem |
---|---|---|
Definite=Def\|Number=Sing | La | |
Definite=Def\|Number=Sing\|PronType=Art | il, lo, l' | la, l', lo |
Definite=Def\|Number=Plur\|PronType=Art | i, gli, il | le, la |
Number=Sing | il | la, L' |
Number=Plur | i | le |
ADJ
2454 ADJ tokens (96% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (1749; 71%).
ADJ tokens may have the following values of Gender:
Fem (787; 32% of non-empty Gender): stessa, diverse, moderna, prima, seconda, unica, italiana, umana, nuova, nuove
Masc (1667; 68% of non-empty Gender): stesso, grande, importante, primo, umano, possibile, difficile, piccolo, grandi, vero
EMPTY (103): altri, altro, maggiore, altre, maggior, superiore, maggiori, migliore, pochi, III
Paradigm grande | Masc | Fem |
---|---|---|
Degree=Abs\|Number=Sing | grandissima | |
Number=Sing | grande, gran | |
Number=Plur | grandi | grandi |
PRON
1391 PRON tokens (45% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (1147; 82%), Number=Sing (1032; 74%), PronType=Prs (857; 62%), Clitic=Yes (747; 54%).
PRON tokens may have the following values of Gender:
Fem (392; 28% of non-empty Gender): ci, la, questa, vi, essa, quella, le, mi, qualcosa, sé
Masc (999; 72% of non-empty Gender): si, lo, questo, ci, ciò, quello, tutti, altri, lui, tutto
EMPTY (1728): che, si, c’, noi, cui, ne, quale, chi, quali, ci
Paradigm lo | Masc | Fem |
---|---|---|
Clitic=Yes\|Number=Sing\|Person=3 | la | |
Clitic=Yes\|Number=Sing\|Person=3\|PronType=Prs | lo, l', gli, li | la, le |
Clitic=Yes\|Number=Plur\|Person=3\|PronType=Prs | li, gli | le |
Definite=Def\|Number=Sing | lo | |
Definite=Def\|Number=Sing\|PronType=Art | lo | la |
Number=Sing | la |
VERB
718 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Person=EMPTY (718; 100%), Mood=EMPTY (717; 100%), VerbForm=Part (714; 99%), Tense=Past (710; 99%), Number=Sing (534; 74%).
VERB tokens may have the following values of Gender:
Fem (217; 30% of non-empty Gender): creata, fatta, porta, sviluppata, considerata, vista, avuta, composta, fatte, sentita
Masc (501; 70% of non-empty Gender): avuto, dato, fatto, visto, inteso, stato, cercato, detto, legati, permesso
EMPTY (3223): è, ha, far, sono, fa, fare, essere, trovare, dare, avere
Paradigm essere | Masc | Fem |
---|---|---|
Number=Sing | stato | stata |
Number=Plur | stati | state |
AUX
61 AUX tokens (3% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (61; 100%), Person=EMPTY (61; 100%), Tense=Past (61; 100%), VerbForm=Part (61; 100%), Number=Sing (47; 77%).
AUX tokens may have the following values of Gender:
Fem (22; 36% of non-empty Gender): stata, state, potuta
Masc (39; 64% of non-empty Gender): stato, stati, potuto, voluto
EMPTY (1979): è, sono, ha, può, essere, hanno, era, possiamo, fu, deve
Paradigm essere | Masc | Fem |
---|---|---|
Number=Sing | stato | stata |
Number=Plur | stati | state |
PROPN
3 PROPN tokens (0% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (3; 100% of non-empty Gender): Human, brain, project
EMPTY (829): Italia, Europa, Germania, Unione, europea, America, Leopardi, Malpelo, Pascal, Romeo
ADP
1 ADP token (0% of all ADP tokens) has a non-empty value of Gender.
ADP tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): a
EMPTY (5459): di, in, a, da, per, con, su, come, ad, tra
ADV
1 ADV token (0% of all ADV tokens) has a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=EMPTY (1; 100%).
ADV tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): parecchio
EMPTY (2291): non, più, proprio, anche, sempre, solo, infatti, così, quindi, molto
CCONJ
1 CCONJ token (0% of all CCONJ tokens) has a non-empty value of Gender.
CCONJ tokens may have the following values of Gender:
Fem (1; 100% of non-empty Gender): e
EMPTY (1347): e, ma, ed, o, sia, oppure, quindi, né, ovvero, cioè
SCONJ
1 SCONJ token (0% of all SCONJ tokens) has a non-empty value of Gender.
SCONJ tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): perché
EMPTY (867): che, se, perché, come, quando, poiché, mentre, nonostante, affinché, dopo
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (4737; 85%),
NOUN –[amod]–> ADJ (1537; 82%),
NOUN –[det:poss]–> DET (359; 93%),
NOUN –[conj]–> NOUN (312; 56%),
ADJ –[conj]–> ADJ (95; 84%),
NOUN –[nsubj]–> NOUN (67; 54%),
ADJ –[det]–> DET (65; 86%),
ADJ –[nsubj]–> NOUN (55; 68%),
NOUN –[det:predet]–> DET (53; 100%),
VERB –[nsubj:pass]–> NOUN (51; 94%).
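Agreement figures like these can be approximated by walking the dependency tree and comparing the Gender of each child with that of its parent. The sketch below is illustrative only. The token tuples are invented, `agreement_counts` is a hypothetical helper, and the percentage is taken over pairs in which both nodes carry a Gender value, which is an assumption and may not match the denominator used for the figures listed here.

```python
from collections import Counter

# Invented tokens: (id, upos, gender or None, head id, deprel). Not treebank data.
sentence = [
    (1, "DET", "Fem", 2, "det"),
    (2, "NOUN", "Fem", 4, "nsubj"),
    (3, "AUX", None, 4, "cop"),
    (4, "ADJ", "Masc", 0, "root"),
]


def agreement_counts(tokens):
    """Count parent-child pairs per (parent UPOS, relation, child UPOS).

    Returns two Counters: pairs that agree in Gender, and all pairs in
    which both nodes have a Gender value.
    """
    by_id = {tok[0]: tok for tok in tokens}
    agree, both = Counter(), Counter()
    for _, upos, gender, head, deprel in tokens:
        parent = by_id.get(head)  # None for the root (head id 0)
        if parent is None or gender is None or parent[2] is None:
            continue
        key = (parent[1], deprel, upos)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return agree, both


agree, both = agreement_counts(sentence)
for (parent_pos, rel, child_pos), n in both.most_common():
    hits = agree[(parent_pos, rel, child_pos)]
    print(f"{parent_pos} -[{rel}]-> {child_pos} ({hits}; {100 * hits / n:.0f}%)")
```

For a whole treebank, the two counters would be accumulated sentence by sentence before the percentages are printed.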