Treebank Statistics: UD_Romanian-SiMoNERo: Features: Gender
This feature is universal.
It occurs with 2 different values: Fem, Masc.
69360 tokens (48%) have a non-empty value of Gender.
14781 types (82%) occur at least once with a non-empty value of Gender.
7672 lemmas (72%) occur at least once with a non-empty value of Gender.
The feature is used with 8 part-of-speech tags: NOUN (39982; 27% instances), ADJ (16686; 11% instances), DET (6907; 5% instances), VERB (3889; 3% instances), PRON (1065; 1% instances), NUM (416; 0% instances), AUX (402; 0% instances), PROPN (13; 0% instances).
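Counts like the ones above can be reproduced from a CoNLL-U file by checking the FEATS column of each token. The following is a minimal sketch, not the script that generated this page: the `SAMPLE` fragment is hand-made for illustration, and the function names `parse_feats` and `gender_coverage` are hypothetical.

```python
from collections import Counter

# Hand-made CoNLL-U fragment (hypothetical, not copied from the treebank).
SAMPLE = """\
# text = cazul clinic mg .
1\tcazul\tcaz\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Masc|Number=Sing\t0\troot\t_\t_
2\tclinic\tclinic\tADJ\t_\tDefinite=Ind|Degree=Pos|Gender=Masc|Number=Sing\t1\tamod\t_\t_
3\tmg\tmg\tNOUN\t_\t_\t1\tnmod\t_\t_
4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def parse_feats(feats):
    """Turn a FEATS string such as 'Case=Nom|Gender=Masc' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def gender_coverage(conllu_text, feature="Gender"):
    """Count tokens per UPOS, in total and with a value for `feature`."""
    total, with_feature = Counter(), Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2), empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos = cols[3]
        total[upos] += 1
        if feature in parse_feats(cols[5]):
            with_feature[upos] += 1
    return total, with_feature


total, with_gender = gender_coverage(SAMPLE)
for upos in total:
    share = 100 * with_gender[upos] / total[upos]
    print(f"{upos}: {with_gender[upos]}/{total[upos]} ({share:.0f}%)")
```

Counting types or lemmas instead of tokens would only require collecting the FORM or LEMMA column into a set rather than incrementing a counter.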
NOUN
39982 NOUN tokens (94% of all NOUN tokens) have a non-empty value of Gender.
The most frequent other feature values with which NOUN and Gender co-occurred: Number=Sing (29256; 73%), Definite=Def (21800; 55%), Case=Nom (20629; 52%).
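Co-occurrence figures of this kind can be approximated by tallying every other feature=value pair on tokens of a given part of speech that carry Gender. This is a sketch under assumptions: the sample rows are hand-made, `cooccurring_features` is a hypothetical name, and the sketch does not emit the `EMPTY` pseudo-values that this page reports for absent features.

```python
from collections import Counter

# Hand-made CoNLL-U rows (hypothetical annotations, for illustration only).
SAMPLE = """\
1\tcazul\tcaz\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Masc|Number=Sing\t0\troot\t_\t_
2\tvârsta\tvârstă\tNOUN\t_\tCase=Nom|Definite=Def|Gender=Fem|Number=Sing\t1\tconj\t_\t_
3\tpacienți\tpacient\tNOUN\t_\tDefinite=Ind|Gender=Masc|Number=Plur\t1\tconj\t_\t_
4\tmg\tmg\tNOUN\t_\t_\t1\tnmod\t_\t_
"""


def cooccurring_features(conllu_text, upos, feature="Gender"):
    """Return (n, counts): n tokens of `upos` that have `feature`, and how
    often each other feature=value pair appears on those tokens."""
    counts = Counter()
    n = 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        feats = {} if cols[5] == "_" else dict(
            pair.split("=", 1) for pair in cols[5].split("|")
        )
        if feature not in feats:
            continue
        n += 1
        for name, value in feats.items():
            if name != feature:
                counts[f"{name}={value}"] += 1
    return n, counts


n, counts = cooccurring_features(SAMPLE, "NOUN")
for pair, count in counts.most_common():
    print(f"{pair} ({count}; {100 * count / n:.0f}%)")
```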
NOUN tokens may have the following values of Gender:
Fem (24935; 62% of non-empty Gender): insulină, creșterea, vârsta, cazuri, scăderea, creștere, insulinei, prezența, vârstă, glucoză
Masc (15047; 38% of non-empty Gender): pacienții, pacienți, ani, nivelul, diabet, risc, cazul, tip, tratamentul, tratament
EMPTY (2716): mg, IC, vs, HTA, TA, DZ, FA, AVC, dl, EI
Paradigm caz | Masc | Fem |
---|---|---|
Case=Gen\|Definite=Def\|Number=Sing | cazului | |
Case=Gen\|Definite=Def\|Number=Plur | cazurilor | |
Case=Nom\|Definite=Def\|Number=Sing | cazul | |
Case=Nom\|Definite=Def\|Number=Plur | cazurile | |
Definite=Ind\|Number=Sing | caz | |
Definite=Ind\|Number=Plur | cazuri |
Gender seems to be a lexical feature of NOUN: 94% of lemmas (4053) occur with only one value of Gender.
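The lexical-feature observation rests on how many lemmas appear with exactly one Gender value. A minimal sketch of that check follows; the sample rows and their annotations are hand-made for illustration, and `single_value_lemmas` is a hypothetical helper, not the code behind this page.

```python
from collections import defaultdict

# Hand-made CoNLL-U rows; the annotations are illustrative only.
SAMPLE = """\
1\tcazul\tcaz\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_
2\tcazuri\tcaz\tNOUN\t_\tGender=Fem|Number=Plur\t1\tconj\t_\t_
3\tvârsta\tvârstă\tNOUN\t_\tGender=Fem|Number=Sing\t1\tconj\t_\t_
4\tvârstă\tvârstă\tNOUN\t_\tGender=Fem|Number=Sing\t1\tconj\t_\t_
5\tpacienți\tpacient\tNOUN\t_\tGender=Masc|Number=Plur\t1\tconj\t_\t_
"""


def single_value_lemmas(conllu_text, upos, feature="Gender"):
    """Return (single, total): lemmas of `upos` seen with exactly one value
    of `feature`, and all lemmas of `upos` seen with any value of it."""
    values_by_lemma = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        if cols[5] == "_":
            continue
        feats = dict(pair.split("=", 1) for pair in cols[5].split("|"))
        if feature in feats:
            values_by_lemma[cols[2]].add(feats[feature])
    single = sum(1 for values in values_by_lemma.values() if len(values) == 1)
    return single, len(values_by_lemma)


single, total = single_value_lemmas(SAMPLE, "NOUN")
if total:
    print(f"{100 * single / total:.0f}% of lemmas ({single}) occur with only one value")
```

A high share suggests the value is a property of the lemma rather than of the individual word form.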
ADJ
16686 ADJ tokens (98% of all ADJ tokens) have a non-empty value of Gender.
The most frequent other feature values with which ADJ and Gender co-occurred: Degree=Pos (16647; 100%), Definite=Ind (16460; 99%), Number=Sing (11573; 69%), Case=EMPTY (9861; 59%).
ADJ tokens may have the following values of Gender:
Fem (10849; 65% of non-empty Gender): mare, clinice, cardiacă, cardiace, renală, crescute, cronică, cardiovasculare, renale, crescută
Masc (5837; 35% of non-empty Gender): vârstnici, crescut, zaharat, mare, important, clinic, vascular, cardiac, lung, normal
EMPTY (366): mici, mari, precoce, noi, standard, postoperatorii, medii, operatorii, lungi, online
Paradigm mare | Masc | Fem |
---|---|---|
Case=Gen\|Definite=Def\|Number=Sing | marii | |
Case=Gen\|Definite=Ind\|Number=Sing | mari | |
Case=Nom\|Definite=Def\|Number=Sing | Marele | marea |
Case=Nom\|Definite=Def\|Number=Plur | marile | |
Case=Nom\|Definite=Ind\|Number=Sing | mare | |
Definite=Ind\|Number=Sing | mare |
DET
6907 DET tokens (93% of all DET tokens) have a non-empty value of Gender.
The most frequent other feature values with which DET and Gender co-occurred: Position=EMPTY (5899; 85%), Number=Sing (5606; 81%), Person=EMPTY (5475; 79%), Poss=EMPTY (3896; 56%).
DET tokens may have the following values of Gender:
Fem (4410; 64% of non-empty Gender): a, o, ale, această, unei, cele, aceste, alte, multe, cea
Masc (2497; 36% of non-empty Gender): un, al, unui, acest, cel, ai, acești, același, acestui, săi
EMPTY (517): unor, lor, acestor, orice, lui, celor, fiecare, altor, ei, căror
Paradigm al | Masc | Fem |
---|---|---|
Number=Sing | al | a |
Number=Plur | ai | ale |
VERB
3889 VERB tokens (38% of all VERB tokens) have a non-empty value of Gender.
The most frequent other feature values with which VERB and Gender co-occurred: Mood=EMPTY (3889; 100%), Person=EMPTY (3889; 100%), Tense=EMPTY (3889; 100%), VerbForm=Part (3889; 100%), Number=Sing (2715; 70%).
VERB tokens may have the following values of Gender:
Fem (1667; 43% of non-empty Gender): asociată, legate, asociate, diagnosticate, efectuate, folosite, cauzată, considerate, utilizate, utilizată
Masc (2222; 57% of non-empty Gender): arătat, demonstrat, efectuat, avut, dovedit, constatat, prezentat, tratați, inclus, observat
EMPTY (6319): poate, pot, are, trebuie, există, au, reprezintă, crește, face, prezintă
Paradigm avea | Masc | Fem |
---|---|---|
Number=Sing | avut | avută |
Number=Plur | avute |
PRON
1065 PRON tokens (25% of all PRON tokens) have a non-empty value of Gender.
The most frequent other feature values with which PRON and Gender co-occurred: Person=3 (1065; 100%), Reflex=EMPTY (1065; 100%), Case=Nom (879; 83%), Strength=EMPTY (858; 81%), PronType=Dem (691; 65%), Number=Sing (648; 61%).
PRON tokens may have the following values of Gender:
Fem (669; 63% of non-empty Gender): ceea, acestea, cea, cele, aceasta, aceea, ele, toate, o, ea
Masc (396; 37% of non-empty Gender): cei, cel, acesta, aceștia, unul, el, acestuia, îl, ei, l-
EMPTY (3138): care, se, ce, s-, acestora, celor, își, li, cărora, ne
Paradigm care | Masc | Fem |
---|---|---|
| | căruia | căreia |
NUM
416 NUM tokens (9% of all NUM tokens) have a non-empty value of Gender.
The most frequent other feature values with which NUM and Gender co-occurred: NumForm=Word (381; 92%), NumType=Ord (257; 62%), Number=Plur (210; 50%).
NUM tokens may have the following values of Gender:
Fem (288; 69% of non-empty Gender): două, prima, ambele, primele, doua, primă, ultima, primei, ultimele, treia
Masc (128; 31% of non-empty Gender): primul, doilea, ultimii, doi, ultimul, treilea, primii, ultimilor, ultimului, prim
EMPTY (4189): 2, 1, 3, 4, 5, 30, 10, 20, 6, 15
Paradigm doi | Masc | Fem |
---|---|---|
| | doi | două |
Gender seems to be a lexical feature of NUM: 92% of lemmas (36) occur with only one value of Gender.
AUX
402 AUX tokens (8% of all AUX tokens) have a non-empty value of Gender.
The most frequent other feature values with which AUX and Gender co-occurred: Mood=EMPTY (402; 100%), Number=Sing (402; 100%), Person=EMPTY (402; 100%), Tense=EMPTY (402; 100%), VerbForm=Part (402; 100%).
AUX tokens may have the following values of Gender:
Masc (402; 100% of non-empty Gender): fost, putut
EMPTY (4556): este, a, au, sunt, fi, fiind, ar, va, fie, am
PROPN
13 PROPN tokens (2% of all PROPN tokens) have a non-empty value of Gender.
PROPN tokens may have the following values of Gender:
Fem (13; 100% of non-empty Gender): Americii, Americă, Asiei, Europei, Franței, Greciei, României
EMPTY (704): Graves-Basedow, Doppler, Rubino, Europa, România, Langerhans, Paulescu, Pendred, Britanie, Esnaola
Relations with Agreement in Gender
The 10 most frequent relations where parent and child node agree in Gender:
NOUN –[amod]–> ADJ (13586; 96%),
NOUN –[nmod]–> NOUN (8534; 50%),
NOUN –[det]–> DET (5035; 74%),
NOUN –[conj]–> NOUN (2752; 64%),
VERB –[nsubj:pass]–> NOUN (753; 62%),
ADJ –[nsubj]–> NOUN (633; 91%),
ADJ –[conj]–> ADJ (630; 93%),
NOUN –[acl]–> ADJ (376; 89%),
VERB –[obl:agent]–> NOUN (237; 54%),
ADJ –[det]–> DET (169; 93%).
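Agreement counts of this kind can be sketched by comparing the Gender value of each token with that of its HEAD. The example below is illustrative only: the two sample sentences are hand-made, `agreement_by_relation` is a hypothetical name, and the denominator is assumed to be the number of parent-child pairs where both nodes carry Gender, which may differ from how the percentages on this page were computed.

```python
from collections import Counter

# Two hand-made sentences (hypothetical annotations), separated by a blank line.
SAMPLE = """\
1\tvârsta\tvârstă\tNOUN\t_\tGender=Fem|Number=Sing\t0\troot\t_\t_
2\tcrescută\tcrescut\tADJ\t_\tGender=Fem|Number=Sing\t1\tamod\t_\t_

1\tun\tun\tDET\t_\tGender=Masc|Number=Sing\t2\tdet\t_\t_
2\trisc\trisc\tNOUN\t_\tGender=Masc|Number=Sing\t0\troot\t_\t_
3\tmare\tmare\tADJ\t_\tGender=Fem|Number=Sing\t2\tamod\t_\t_
"""


def agreement_by_relation(conllu_text, feature="Gender"):
    """Return (agree, comparable) counters keyed by 'PARENT -[deprel]-> CHILD'.

    `comparable` counts pairs where both nodes have `feature` (an assumption
    about the denominator); `agree` counts those with identical values."""
    agree, comparable = Counter(), Counter()
    for block in conllu_text.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue
            feats = {} if cols[5] == "_" else dict(
                pair.split("=", 1) for pair in cols[5].split("|")
            )
            # Store UPOS, feature value (or None), HEAD id and DEPREL per token.
            tokens[cols[0]] = (cols[3], feats.get(feature), cols[6], cols[7])
        for upos, value, head, deprel in tokens.values():
            parent = tokens.get(head)  # None for the root (HEAD = 0)
            if parent is None or value is None or parent[1] is None:
                continue
            key = f"{parent[0]} -[{deprel}]-> {upos}"
            comparable[key] += 1
            if value == parent[1]:
                agree[key] += 1
    return agree, comparable


agree, comparable = agreement_by_relation(SAMPLE)
for key, count in agree.most_common(10):
    print(f"{key} ({count}; {100 * count / comparable[key]:.0f}%)")
```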