Treebank Statistics: UD_Italian-TWITTIRO: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
9154 tokens (31%) have a non-empty value of Gender.
2946 types (48%) occur at least once with a non-empty value of Gender.
2329 lemmas (48%) occur at least once with a non-empty value of Gender.
The feature is used with 10 part-of-speech tags: NOUN (4195; 14% instances), DET (3006; 10% instances), ADJ (981; 3% instances), VERB (505; 2% instances), PRON (443; 1% instances), AUX (19; 0% instances), ADV (2; 0% instances), PROPN (1; 0% instances), SYM (1; 0% instances), X (1; 0% instances).
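Counts like the ones above can be reproduced from a CoNLL-U file by tallying, per UPOS tag, how many tokens carry Gender in the FEATS column. The following is a minimal sketch, not the script that generated this page: `SAMPLE`, `parse_feats`, and `gender_coverage` are hypothetical names, the three-token fragment is hand-made, and skipping multiword-token ranges is an assumption.

```python
from collections import Counter

# Tiny hand-made CoNLL-U fragment (hypothetical, not taken from the treebank).
SAMPLE = """\
1\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art\t2\tdet\t_\t_
2\tscuola\tscuola\tNOUN\tS\tGender=Fem|Number=Sing\t0\troot\t_\t_
3\tcontinua\tcontinuare\tVERB\tV\tMood=Ind|Number=Sing|Person=3\t2\tacl\t_\t_
"""

def parse_feats(feats):
    """Turn the FEATS column into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def gender_coverage(conllu_text):
    """Count, per UPOS, all tokens and the tokens with a non-empty Gender."""
    total, with_gender = Counter(), Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank line between sentences, or a comment
        cols = line.split("\t")
        # Assumption: skip multiword-token ranges (1-2) and empty nodes (1.1).
        if not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if "Gender" in parse_feats(cols[5]):
            with_gender[upos] += 1
    return total, with_gender

total, with_gender = gender_coverage(SAMPLE)
for upos in total:
    print(upos, with_gender[upos], total[upos])
```

Dividing `with_gender[upos]` by `total[upos]` gives a per-tag share of the same kind as the "94% of all NOUN tokens" figure in the NOUN section.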
NOUN
4195 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (2873; 68%).
NOUN tokens may have the following values of Gender:
Fem (1848; 44% of non-empty Gender): scuola, riforma, cosa, casa, crisi, vita, foto, volta, cit., fine
Masc (2347; 56% of non-empty Gender): governo, anni, lavoro, anno, italiani, mesi, mondo, tagli, merito, ministro
EMPTY (260): RT, docenti, grazie, spread, inglese, insegnanti, rain, tweet, prof, hashtag
Paradigm ministro | Masc | Fem |
---|---|---|
Number=Sing | ministro | ministra |
Number=Plur | ministri | |
Gender seems to be a lexical feature of NOUN: 99% of lemmas (1681) occur with only one value of Gender.
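The "lexical feature" judgement rests on how many lemmas are seen with a single Gender value. A minimal sketch of that check follows, assuming only lemmas with at least one non-empty Gender value are counted. The observations and the name `single_value_share` are hypothetical, not taken from the treebank tooling.

```python
from collections import defaultdict

# Hypothetical (lemma, Gender value) observations for NOUN tokens.
observations = [
    ("scuola", "Fem"), ("scuola", "Fem"),
    ("governo", "Masc"),
    ("ministro", "Masc"), ("ministro", "Fem"),
]

def single_value_share(pairs):
    """Return (lemmas with exactly one Gender value, lemmas seen with Gender)."""
    values_by_lemma = defaultdict(set)
    for lemma, value in pairs:
        values_by_lemma[lemma].add(value)
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)

single, n = single_value_share(observations)
print(f"{single} of {n} lemmas occur with only one Gender value")
```

In this toy data, `ministro` is the only lemma seen with both values, matching the ministro/ministra pair in the paradigm table.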
DET
3006 DET tokens (87% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: PronType=Art (2808; 93%), Definite=Def (2394; 80%), Number=Sing (2287; 76%).
DET tokens may have the following values of Gender:
Fem (1223; 41% of non-empty Gender): la, le, una, un’, questa, sua, mia, tutte, quella, tua
Masc (1783; 59% of non-empty Gender): il, i, un, gli, lo, suo, tutti, mio, questo, uno
EMPTY (439): l’, che, l’, ogni, qualche, loro, tutto, quale, sto, tutti
Paradigm il | Masc | Fem |
---|---|---|
Number=Sing | il, lo, er | la, ka |
Number=Plur | i, gli | le |
ADJ
981 ADJ tokens (79% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Number=Sing (748; 76%).
ADJ tokens may have the following values of Gender:
Fem (458; 47% of non-empty Gender): buona, bella, italiana, pubblica, prima, unica, igienica, nuova, prime, nuove
Masc (523; 53% of non-empty Gender): nuovo, primo, buon, italiano, bel, caro, giusto, italiani, unico, bello
EMPTY (256): grande, ex, acid, possibile, elementari, miglior, facile, civili, fiscale, forte
Paradigm buono | Masc | Fem |
---|---|---|
Number=Sing | buon, buono | buona |
Number=Plur | buoni | buone |
VERB
505 VERB tokens (18% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (505; 100%), Person=EMPTY (505; 100%), Tense=Past (504; 100%), VerbForm=Part (504; 100%), Number=Sing (433; 86%).
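In co-occurrence lists like this one, a value of EMPTY indicates that the token has no value for that feature at all, so Mood=EMPTY at 100% means none of the gendered VERB tokens carries Mood. A minimal sketch of such a tally over already-parsed FEATS follows. The token dicts, the `OTHER_FEATURES` selection, and the function name `cooccurrence` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical FEATS dicts of VERB tokens that carry Gender.
tokens = [
    {"Gender": "Masc", "Number": "Sing", "Tense": "Past", "VerbForm": "Part"},
    {"Gender": "Fem", "Number": "Sing", "Tense": "Past", "VerbForm": "Part"},
    {"Gender": "Masc", "Number": "Plur", "Tense": "Past", "VerbForm": "Part"},
]
OTHER_FEATURES = ["Mood", "Number", "Tense"]  # illustrative choice of features

def cooccurrence(tokens, features):
    """Count Feature=Value pairs, recording a missing feature as EMPTY."""
    counts = Counter()
    for feats in tokens:
        for name in features:
            counts[f"{name}={feats.get(name, 'EMPTY')}"] += 1
    return counts

counts = cooccurrence(tokens, OTHER_FEATURES)
for pair, n in counts.most_common():
    print(f"{pair} ({n}; {round(100 * n / len(tokens))}%)")
```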
VERB tokens may have the following values of Gender:
Fem (85; 17% of non-empty Gender): fatta, letta, interrogata, varata, iniziata, ritrovata, scritta, trovata, @user, Basta
Masc (420; 83% of non-empty Gender): fatto, detto, morto, messo, avuto, dato, letto, arrivato, capito, lasciato
EMPTY (2341): continua, fare, è, fa, ha, dire, dice, va, far, parla
Paradigm fare | Masc | Fem |
---|---|---|
| | fatto | fatta |
Gender seems to be a lexical feature of VERB: 91% of lemmas (243) occur with only one value of Gender.
PRON
443 PRON tokens (31% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Number=Sing (321; 72%), Clitic=EMPTY (255; 58%), Person=EMPTY (247; 56%).
PRON tokens may have the following values of Gender:
Fem (101; 23% of non-empty Gender): la, quella, questa, le, lei, quelle, una, altra, mia, tante
Masc (342; 77% of non-empty Gender): lo, tutti, tutto, li, gli, quello, questo, altro, nessuno, qualcuno
EMPTY (973): si, che, ci, mi, chi, c’, ti, ne, noi, io
Paradigm lo | Masc | Fem |
---|---|---|
Number=Sing | lo, l', qual | la |
Number=Plur | li | |
AUX
19 AUX tokens (2% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (19; 100%), Person=EMPTY (19; 100%), Tense=Past (19; 100%), VerbForm=Part (19; 100%), Number=Sing (18; 95%).
AUX tokens may have the following values of Gender:
Fem (8; 42% of non-empty Gender): stata
Masc (11; 58% of non-empty Gender): stato, potuto, stati
EMPTY (1070): è, ha, sono, era, e’, siamo, hanno, ho, essere, sarà
Paradigm essere | Masc | Fem |
---|---|---|
Number=Sing | stato | stata |
Number=Plur | stati | |
ADV
2 ADV tokens (0% of all ADV tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADV and Gender co-occurred: PronType=Ind (2; 100%).
ADV tokens may have the following values of Gender:
Masc (2; 100% of non-empty Gender): tutti
EMPTY (1408): non, anche, più, ora, solo, poi, ancora, così, bene, già
PROPN
1 PROPN token (0% of all PROPN tokens) has a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): Folletto
EMPTY (2013): monti, mario, renzi, italia, pd, Berlusconi, Roma, Salvini, Papa, giannini
SYM
1 SYM token (0% of all SYM tokens) has a non-empty value of Gender.
SYM tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): #cambiaverso
EMPTY (2145): @user, #labuonascuola, #monti, @user1, @user2, #renzi, #scuola, @user3, http://t.co/oDPUtx2DvV, #Grillo
X
1 X token (1% of all X tokens) has a non-empty value of Gender.
X tokens may have the following values of Gender:
Masc (1; 100% of non-empty Gender): mal
EMPTY (109): e, i, o, partes, super, zan, #labuonascuola, #tassadopotassa, 10cent, 13.mo
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[det]–> DET (2261; 85%),
NOUN –[amod]–> ADJ (637; 79%),
NOUN –[det:poss]–> DET (85; 88%),
VERB –[nsubj:pass]–> NOUN (48; 72%),
NOUN –[compound]–> NOUN (32; 63%),
ADJ –[nsubj]–> NOUN (30; 64%),
PRON –[det]–> DET (19; 66%),
NOUN –[nsubj]–> PRON (18; 53%),
NOUN –[parataxis]–> ADJ (16; 67%),
ADJ –[conj]–> ADJ (15; 65%).
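Agreement figures of this kind can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is one possible reading, not the page's actual procedure: it assumes the percentage is taken over edges where both nodes have a non-empty Gender, which the page does not state. The toy sentence and the name `agreement_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical tokens: (id, upos, Gender or None, head id, deprel).
sentence = [
    (1, "DET", "Fem", 2, "det"),
    (2, "NOUN", "Fem", 0, "root"),
    (3, "ADJ", "Masc", 2, "amod"),
    (4, "ADV", None, 3, "advmod"),
]

def agreement_counts(tokens):
    """Per (parent UPOS, deprel, child UPOS): edges with Gender on both ends,
    and the subset of those edges where the two values match."""
    by_id = {tok[0]: tok for tok in tokens}
    both, agree = Counter(), Counter()
    for _tid, upos, gender, head, deprel in tokens:
        parent = by_id.get(head)  # None for the root (head id 0)
        if parent is None or gender is None or parent[2] is None:
            continue
        key = (parent[1], deprel, upos)
        both[key] += 1
        if gender == parent[2]:
            agree[key] += 1
    return both, agree

both, agree = agreement_counts(sentence)
for key, n in both.items():
    parent_upos, deprel, child_upos = key
    share = round(100 * agree[key] / n)
    print(f"{parent_upos} -[{deprel}]-> {child_upos} ({agree[key]}; {share}%)")
```

Under a different denominator, for example all edges of a given type regardless of whether Gender is present, the percentages would come out lower, so the choice should be checked against the tool that produced the page.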