Treebank Statistics: UD_Romanian-RRT: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
90308 tokens (41%) have a non-empty value of Gender.
24196 types (77%) occur at least once with a non-empty value of Gender.
12077 lemmas (70%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (52827; 24% instances), ADJ (14474; 7% instances), DET (10401; 5% instances), VERB (7634; 3% instances), PRON (3079; 1% instances), NUM (940; 0% instances), AUX (631; 0% instances), PROPN (322; 0% instances).
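Counts of this kind can be approximated directly from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the helper names (`parse_feats`, `read_tokens`, `gender_counts`) and the four-token sample with its annotations are hypothetical and serve only as an illustration.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Acc,Nom|Gender=Fem' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def read_tokens(conllu_text):
    """Yield (form, lemma, upos, feats_dict) for every syntactic word."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        yield cols[1], cols[2], cols[3], parse_feats(cols[5])

def gender_counts(conllu_text):
    """Return (total token count, Counter of tokens with non-empty Gender per UPOS)."""
    total = 0
    with_gender = Counter()
    for _form, _lemma, upos, feats in read_tokens(conllu_text):
        total += 1
        if "Gender" in feats:
            with_gender[upos] += 1
    return total, with_gender

# Hypothetical sample; the annotations are made up for illustration.
SAMPLE = "\n".join([
    "# text = un timp bun .",
    "1\tun\tun\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_",
    "2\ttimp\ttimp\tNOUN\t_\tDefinite=Ind|Gender=Masc|Number=Sing\t0\troot\t_\t_",
    "3\tbun\tbun\tADJ\t_\tDegree=Pos|Gender=Masc|Number=Sing\t2\tamod\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])

total, with_gender = gender_counts(SAMPLE)
share = 100 * sum(with_gender.values()) / total
print(total, dict(with_gender), share)
```

Run over the full treebank files, the same loop would yield the per-tag counts; the percentages above appear to be taken relative to all tokens, which is the denominator this sketch assumes.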
NOUN
52827 NOUN tokens (97% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (38593; 73%), Case=Acc,Nom (28805; 55%), Definite=Def (27198; 51%).
NOUN tokens may have the following values of Gender:
Fem (32519; 62% of non-empty Gender): conformitate, membre, statele, Comisia, parte, față, partea, fața, comisiei, urmă
Masc (20308; 38% of non-empty Gender): ani, timp, cazul, loc, timpul, mod, acord, b, lucru, cadrul
EMPTY (1431): art., a., nr., CE, b., mg, lit., alin., ml, CEE
Paradigm timp | Masc | Fem |
---|---|---|
Case=Acc,Nom\|Definite=Def\|Number=Sing | timpul | |
Case=Acc,Nom\|Definite=Def\|Number=Sing\|Variant=Short | timpu' | |
Case=Acc,Nom\|Definite=Def\|Number=Plur | timpurile | |
Case=Dat,Gen\|Definite=Def\|Number=Sing | timpului | |
Definite=Ind\|Number=Sing | timp | |
Definite=Ind\|Number=Plur | timpuri |
Gender seems to be a lexical feature of NOUN. 92% of lemmas (7033) occur with only one value of Gender.
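The "lexical feature" check can be sketched as follows: group the observed Gender values by lemma and count the lemmas that only ever appear with one value. This is an illustrative sketch under stated assumptions. The function name `single_gender_share` and the toy pairs are hypothetical, and the denominator (lemmas with at least one non-empty Gender value) is an assumption, since the page does not state it.

```python
from collections import defaultdict

def single_gender_share(lemma_gender_pairs):
    """Return (lemmas seen with exactly one Gender value, lemmas seen with any Gender value)."""
    values = defaultdict(set)
    for lemma, gender in lemma_gender_pairs:
        if gender:  # ignore tokens whose Gender is empty
            values[lemma].add(gender)
    single = sum(1 for seen in values.values() if len(seen) == 1)
    return single, len(values)

# Toy data, made up for illustration (not taken from the treebank).
pairs = [
    ("parte", "Fem"), ("parte", "Fem"),
    ("timp", "Masc"), ("timp", "Fem"),
    ("loc", "Masc"),
    ("art.", None),
]

single, total = single_gender_share(pairs)
print(f"{single} of {total} lemmas occur with only one value of Gender")
```

A high share suggests that gender is stored with the lemma rather than chosen by inflection, which is the reading the page gives for NOUN.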
ADJ
14474 ADJ tokens (95% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (14434; 100%), Definite=Ind (13597; 94%), Number=Sing (9616; 66%), Case=EMPTY (8880; 61%).
ADJ tokens may have the following values of Gender:
Fem (9207; 64% of non-empty Gender): europene, necesare, prezenta, europeană, mică, naționale, română, chimice, prezentei, maximă
Masc (5267; 36% of non-empty Gender): prezentul, nou, european, prezentului, general, mic, național, bun, românesc, singur
EMPTY (824): mare, asemenea, mari, mici, standard, noi, vechi, anume, românești, roșii
Paradigm mare | Masc | Fem |
---|---|---|
Case=Acc,Nom\|Definite=Def\|Number=Sing | marele | marea |
Case=Acc,Nom\|Definite=Def\|Number=Plur | marii | marile |
Case=Dat,Gen\|Definite=Def\|Number=Sing | marelui | Marii |
Case=Dat,Gen\|Definite=Ind\|Number=Sing | mari |
DET
10401 DET tokens (86% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Position=EMPTY (8973; 86%), Number=Sing (8609; 83%), Person=EMPTY (7680; 74%), Poss=EMPTY (7070; 68%), Case=Acc,Nom (6028; 58%), PronType=Ind (5340; 51%).
DET tokens may have the following values of Gender:
Fem (6252; 60% of non-empty Gender): o, a, ale, unei, toate, această, aceste, cele, alte, multe
Masc (4149; 40% of non-empty Gender): un, al, unui, acest, cel, său, ai, același, cei, acestui
EMPTY (1624): lui, lor, orice, unor, fiecare, ei, acestor, niște, tuturor, celor
Paradigm un | Masc | Fem |
---|---|---|
Case=Acc,Nom\|Number=Sing | un, -un | o, -o |
Case=Acc,Nom\|Number=Plur\|Person=3\|Position=Prenom | unii | unele |
Case=Dat,Gen\|Number=Sing | unui | unei |
VERB
7634 VERB tokens (33% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (7634; 100%), Person=EMPTY (7634; 100%), Tense=EMPTY (7634; 100%), VerbForm=Part (7634; 100%), Number=Sing (5575; 73%).
VERB tokens may have the following values of Gender:
Fem (2845; 37% of non-empty Gender): prevăzute, menționate, prevăzută, stabilite, legate, utilizate, prezentate, asociate, puse, obținute
Masc (4789; 63% of non-empty Gender): avut, făcut, spus, putut, rupt, dat, murit, devenit, luat, rănit
EMPTY (15365): poate, trebuie, pot, putea, avea, are, face, era, există, au
Paradigm avea | Masc | Fem |
---|---|---|
Number=Sing | avut | avută |
Number=Plur | avute |
PRON
3079 PRON tokens (26% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Reflex=EMPTY (3079; 100%), Person=3 (3063; 99%), Variant=EMPTY (2646; 86%), Number=Sing (2173; 71%), Case=Acc,Nom (1930; 63%), PronType=Prs (1614; 52%).
PRON tokens may have the following values of Gender:
Fem (1547; 50% of non-empty Gender): o, le, ea, ceea, aceasta, acestea, -o, una, ele, toate
Masc (1532; 50% of non-empty Gender): el, -l, îl, unul, ei, l-, acesta, cel, cei, acestuia
EMPTY (8728): se, care, ce, s-, își, -și, și-, îi, -se, -i
Paradigm el | Masc | Fem |
---|---|---|
Case=Acc,Nom\|Number=Sing\|Strength=Strong | el | ea |
Case=Acc,Nom\|Number=Plur\|Strength=Strong | ei | ele |
Case=Acc\|Number=Sing\|Strength=Weak | îl | o |
Case=Acc\|Number=Sing\|Strength=Weak\|Variant=Short | -l, l-, l | -o |
Case=Acc\|Number=Plur\|Strength=Weak | îi, i | le |
Case=Acc\|Number=Plur\|Strength=Weak\|Variant=Short | -i, i- | le-, -le |
Case=Dat,Gen\|Number=Sing\|Strength=Strong | lui | ei |
Case=Dat\|Number=Plur\|Strength=Weak\|Variant=Short | -i |
NUM
940 NUM tokens (17% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (892; 95%), Number=Plur (483; 51%), NumType=Ord (472; 50%).
NUM tokens may have the following values of Gender:
Fem (579; 62% of non-empty Gender): două, prima, doua, primele, milioane, o, ambele, mii, treia, ultimele
Masc (361; 38% of non-empty Gender): primul, doi, doilea, ultimii, un, ultimul, unu, primului, amândoi, prim-
EMPTY (4609): 1, 2, 3, 4, trei, 5, 6, 7, 8, i
Paradigm doi | Masc | Fem |
---|---|---|
Number=Sing\|NumType=Ord | doilea, secund | doua |
Number=Plur\|NumType=Card | doi | două |
AUX
631 AUX tokens (7% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (631; 100%), Number=Sing (631; 100%), Person=EMPTY (631; 100%), Tense=EMPTY (631; 100%), VerbForm=Part (631; 100%).
AUX tokens may have the following values of Gender:
Masc (631; 100% of non-empty Gender): fost
EMPTY (7934): a, este, au, sunt, fi, era, va, ar, am, fie
PROPN
322 PROPN tokens (5% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (254; 79% of non-empty Gender): României, Moldovei, Dunării, Europei, Franței, Italiei, Norvegiei, Rusiei, Ungariei, Germaniei
Masc (68; 21% of non-empty Gender): Carpaților, Iașilor, Jiului, Banatul, Iașii, Israelul, Israelului, Aradului, Banatului, Bucureștiului
EMPTY (5563): România, Winston, București, Timișoara, Iași, Ion, Paris, Alexandru, O’Brien, Moldova
Gender seems to be a lexical feature of PROPN. 100% of lemmas (104) occur with only one value of Gender.
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (11682; 95%),
NOUN –[nmod]–> NOUN (8995; 54%),
NOUN –[det]–> DET (8122; 78%),
NOUN –[conj]–> NOUN (2494; 73%),
VERB –[nsubj:pass]–> NOUN (1025; 60%),
ADJ –[conj]–> ADJ (662; 93%),
VERB –[conj]–> VERB (528; 61%),
ADJ –[nsubj]–> NOUN (383; 91%),
VERB –[obl:agent]–> NOUN (345; 51%),
NOUN –[appos]–> NOUN (308; 57%).
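Agreement counts like those listed above can be approximated by walking each dependency edge and comparing the Gender of parent and child. The sketch below is illustrative only: the function names and the three short sample sentences are hypothetical, and it assumes the percentage is taken over edges where both nodes have a non-empty Gender, which the page does not state.

```python
from collections import Counter

def parse_feats(feats):
    """Turn a CoNLL-U FEATS string into a dict; '_' means no features."""
    return {} if feats == "_" else dict(p.split("=", 1) for p in feats.split("|"))

def sentences(conllu_text):
    """Yield each sentence as a dict: token id -> (upos, feats, head, deprel)."""
    sent = {}
    for line in conllu_text.splitlines() + [""]:  # trailing "" flushes the last sentence
        if not line.strip():
            if sent:
                yield sent
            sent = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only plain word lines with an integer ID and HEAD.
        if len(cols) == 10 and cols[0].isdigit() and cols[6].isdigit():
            sent[int(cols[0])] = (cols[3], parse_feats(cols[5]), int(cols[6]), cols[7])

def gender_agreement(conllu_text):
    """Count, per (parent UPOS, deprel, child UPOS), edges with and without agreement."""
    agree, both = Counter(), Counter()
    for sent in sentences(conllu_text):
        for upos, feats, head, deprel in sent.values():
            if head == 0 or head not in sent:
                continue  # root or dangling head
            p_upos, p_feats = sent[head][0], sent[head][1]
            if "Gender" in feats and "Gender" in p_feats:
                key = (p_upos, deprel, upos)
                both[key] += 1
                if feats["Gender"] == p_feats["Gender"]:
                    agree[key] += 1
    return agree, both

# Hypothetical sample sentences; annotations are made up for illustration.
SAMPLE = "\n".join([
    "1\tstatele\tstat\tNOUN\t_\tGender=Fem|Number=Plur\t0\troot\t_\t_",
    "2\teuropene\teuropean\tADJ\t_\tGender=Fem|Number=Plur\t1\tamod\t_\t_",
    "",
    "1\tun\tun\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_",
    "2\ttimp\ttimp\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_",
    "",
    "1\tcazul\tcaz\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_",
    "2\tcomisiei\tcomisie\tNOUN\t_\tGender=Fem|Number=Sing\t1\tnmod\t_\t_",
])

agree, both = gender_agreement(SAMPLE)
for key, n in both.items():
    print(key, agree[key], f"{100 * agree[key] / n:.0f}%")
```

Relations such as amod and det are expected to show high agreement because the dependent inflects to match its head, while nmod links two nouns that each carry their own gender, so matches there are closer to chance.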