Treebank Statistics: UD_Romanian-Nonstandard: POS Tags: NOUN
There are 6417 NOUN lemmas (46%), 14523 NOUN types (42%) and 96783 NOUN tokens (17%).
Out of 16 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
The 10 most frequent NOUN lemmas: domn, vodă, om, țară, zi, cuvânt, lucru, turc, oaste, frate
The 10 most frequent NOUN types: vodă, domnul, doamne, țara, țară, omul, om, domnului, oaste, cuvîntul
The 10 most frequent ambiguous lemmas: domn (NOUN 2499, PROPN 18, VERB 1), vodă (NOUN 1962, VERB 1), om (NOUN 1728, PROPN 3, NUM 1), țară (NOUN 1431, PROPN 6, ADJ 5, ADV 1, VERB 1), zi (NOUN 1004, ADV 17, PROPN 4, VERB 3, ADP 1), turc (NOUN 694, PROPN 8), împărat (NOUN 667, ADV 4, PROPN 3), boier (NOUN 664, PROPN 5), vreme (NOUN 637, PROPN 1), lume (NOUN 623, PRON 1, PROPN 1, VERB 1)
The 10 most frequent ambiguous types: vodă (NOUN 1941, PROPN 1, VERB 1), omul (NOUN 411, PROPN 1), om (NOUN 390, AUX 42, NUM 1), parte (NOUN 293, ADV 1), numele (NOUN 261, ADP 1), fiiul (NOUN 78, PROPN 1), domnu (NOUN 244, PROPN 8), credință (NOUN 251, VERB 1), lucru (NOUN 248, VERB 2), duhul (NOUN 74, ADP 1)
Morphology
The form / lemma ratio of NOUN is 2.263207 (the average of all parts of speech is 2.491875).
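The reported ratio appears consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. A minimal Python sketch of that arithmetic, assuming this is how the figure is derived (the function name is illustrative, not part of any UD tool):

```python
def form_lemma_ratio(num_forms: int, num_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    if num_lemmas <= 0:
        raise ValueError("num_lemmas must be positive")
    return num_forms / num_lemmas


# Counts reported on this page for NOUN: 14523 types, 6417 lemmas.
noun_ratio = form_lemma_ratio(14523, 6417)
print(f"{noun_ratio:.6f}")
```

A value below the all-tags average of 2.491875 would mean that a NOUN lemma has, on average, fewer distinct spellings and inflected forms than a typical lemma in this treebank.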
The 1st highest number of forms (43) was observed with the lemma “zi”: dni, dza, dze, dzi, dzii, dzile, dzilele, dzileli, dzili, dzilile, dzio, dzioa, dziua, dzua, dzuei, dzuo, dzuoa, dzuîi, dzuă, dzî, dzîle, dzîlele, dzîli, dzîlile, zi, zile, zileei, zilei, zilele, zilelor, zio, zioa, ziua, ziuă, zoa, zua, zuo, zuoa, zuă, zâua, zî, zîle, zîlele.
The 2nd highest number of forms (38) was observed with the lemma “împărăție”: -mpărațîi, -mpărăţia, -mpărăţie, -mpărăție, -mpărățiia, -mpărățîia, -mpărățîie, Impărățiia, mpărățiia, mpărățîie, npărăție, părățiia, împărăţia, împărăția, împărăție, împărăției, împărății, împărățiia, împărățiii, împărățiile, împărățiilor, împărățâe, împărățâei, împărățâia, împărățâiei, împărățîe, împărățîi, împărățîia, împărățîiei, înpărăție, înpărăției, înpărățiia, înpărățiie, înpărățiii, înpărățâe, înpărățâei, înpărățâia, înpărățâie.
The 3rd highest number of forms (30) was observed with the lemma “drept”: Derepțîi, dereapte, derep, derept, dereptul, dereptului, derepți, derepții, derepților, direapta, dirept, direptul, direptului, direpț, direpți, direpții, direpților, direpțîi, direpțîlor, dreapta, drept, drepte, dreptul, drepturi, drepturile, drepturilor, drepț, drepți, drepțî, dritul.
NOUN occurs with 5 features: Case (96782; 100% instances), Definite (96782; 100% instances), Gender (96782; 100% instances), Number (96782; 100% instances), Degree (6; 0% instances)
NOUN occurs with 10 feature-value pairs: Case=Acc,Nom, Case=Dat,Gen, Case=Voc, Definite=Def, Definite=Ind, Degree=Pos, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing
NOUN occurs with 26 feature combinations.
The most frequent feature combination is Case=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing (20856 tokens).
Examples: țară, oaste, lume, pace, parte, credință, vreme, casă, milă, gură
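A feature combination such as the one above follows the CoNLL-U FEATS layout: pairs are separated by `|`, a name and its value by `=`, and multiple values of one feature by `,`. The sketch below shows one way to split such a string in Python; `parse_feats` is a hypothetical helper written for this illustration, not a function of any UD library.

```python
def parse_feats(feats: str) -> dict[str, list[str]]:
    """Split a CoNLL-U FEATS string into {feature name: [values]}."""
    if feats in ("", "_"):  # "_" marks an empty FEATS column
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, values = pair.partition("=")
        parsed[name] = values.split(",")
    return parsed


combo = "Case=Acc,Nom|Definite=Ind|Gender=Fem|Number=Sing"
features = parse_feats(combo)
print(features["Case"])
```

Keeping the values as lists makes a syncretic value such as Case=Acc,Nom explicit instead of treating it as one opaque string.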
Relations
NOUN nodes are attached to their parents using 30 different relations: obl (23185; 24% instances), nmod (15435; 16% instances), nsubj (15341; 16% instances), obj (14919; 15% instances), conj (8383; 9% instances), obl:pmod (3858; 4% instances), nmod:tmod (2927; 3% instances), appos (2326; 2% instances), root (2276; 2% instances), vocative (2030; 2% instances), iobj (1838; 2% instances), xcomp (1310; 1% instances), acl (563; 1% instances), advcl (467; 0% instances), parataxis (365; 0% instances), nsubj:pass (358; 0% instances), ccomp (324; 0% instances), obl:agent (316; 0% instances), compound (142; 0% instances), flat (134; 0% instances), csubj (82; 0% instances), orphan (81; 0% instances), case (48; 0% instances), advcl:tcl (30; 0% instances), fixed (18; 0% instances), amod (12; 0% instances), ccomp:pmod (11; 0% instances), nummod (2; 0% instances), advmod (1; 0% instances), discourse (1; 0% instances)
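Counts of this kind can in principle be reproduced by tallying the DEPREL column of every NOUN token in the treebank's CoNLL-U files. The following Python sketch shows the idea on a three-token toy sentence that was invented for this illustration and is not taken from the treebank; `count_relations` and `with_shares` are hypothetical helper names, and the script that generated this page may work differently.

```python
from collections import Counter

# Invented toy sentence in CoNLL-U layout (10 tab-separated columns).
# It is NOT treebank data; the analysis is for illustration only.
ROWS = [
    ("1", "domnul", "domn", "NOUN", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "vede", "vedea", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "țara", "țară", "NOUN", "_", "_", "2", "obj", "_", "_"),
]
SAMPLE = "# text = domnul vede țara\n" + "\n".join("\t".join(r) for r in ROWS) + "\n"


def count_relations(conllu: str, upos: str = "NOUN") -> Counter:
    """Count the DEPREL values of all tokens carrying the given UPOS tag."""
    counts = Counter()
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or sentence-level comment
        cols = line.split("\t")
        # Skip malformed rows, multiword-token ranges (1-2) and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            counts[cols[7]] += 1
    return counts


def with_shares(counts: Counter) -> list[tuple[str, int, int]]:
    """Return (relation, count, rounded percentage), most frequent first."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [(rel, n, round(100 * n / total)) for rel, n in counts.most_common()]


noun_relations = count_relations(SAMPLE)
for rel, n, pct in with_shares(noun_relations):
    print(f"{rel} ({n}; {pct}% instances)")
```

To run it on real data, the contents of a `.conllu` file would be read into a string and passed to `count_relations` in place of `SAMPLE`.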
Parents of NOUN nodes belong to 16 different categories, counting root attachment (no parent) as one of them: VERB (62273; 64% instances), NOUN (21928; 23% instances), PROPN (4850; 5% instances), no parent, i.e. root (2276; 2% instances), ADJ (1923; 2% instances), ADV (1465; 2% instances), PRON (1426; 1% instances), AUX (214; 0% instances), NUM (177; 0% instances), INTJ (132; 0% instances), DET (68; 0% instances), ADP (29; 0% instances), SCONJ (11; 0% instances), X (6; 0% instances), CCONJ (4; 0% instances), PUNCT (1; 0% instances)
22704 (23%) NOUN nodes are leaves.
34218 (35%) NOUN nodes have one child.
22319 (23%) NOUN nodes have two children.
17542 (18%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 19.
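As a consistency check, the four child-degree counts above should add up to the 96783 NOUN tokens reported at the top of the page, and the percentages should follow from simple rounding. A short Python sketch of that check, assuming the page rounds to the nearest whole percent:

```python
NOUN_TOKENS = 96783  # total NOUN tokens reported on this page

degree_counts = {
    "leaves": 22704,
    "one child": 34218,
    "two children": 22319,
    "three or more children": 17542,
}

# The four classes partition all NOUN nodes.
total = sum(degree_counts.values())

# Share of each class, rounded to a whole percent.
shares = {label: round(100 * n / NOUN_TOKENS) for label, n in degree_counts.items()}

for label, n in degree_counts.items():
    print(f"{n} ({shares[label]}%) NOUN nodes: {label}")
```

Because each share is rounded separately, the rounded percentages need not sum to exactly 100.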
Children of NOUN nodes are attached using 44 different relations: case (40724; 28% instances), nmod (20649; 14% instances), punct (18538; 13% instances), det (16230; 11% instances), amod (8885; 6% instances), conj (8397; 6% instances), cc (7057; 5% instances), advmod (4316; 3% instances), acl (4171; 3% instances), cop (3793; 3% instances), nummod (3424; 2% instances), nsubj (2310; 2% instances), appos (1733; 1% instances), mark (1716; 1% instances), iobj (823; 1% instances), obl (776; 1% instances), advcl (504; 0% instances), aux (471; 0% instances), obl:pmod (194; 0% instances), parataxis (189; 0% instances), advmod:tmod (170; 0% instances), vocative (163; 0% instances), csubj (161; 0% instances), nmod:tmod (160; 0% instances), compound (114; 0% instances), flat (107; 0% instances), discourse (98; 0% instances), cc:preconj (80; 0% instances), obj (76; 0% instances), orphan (73; 0% instances), xcomp (57; 0% instances), expl (50; 0% instances), advcl:tcl (46; 0% instances), expl:pv (24; 0% instances), ccomp (15; 0% instances), aux:pass (12; 0% instances), nsubj:pass (7; 0% instances), obl:agent (7; 0% instances), expl:poss (3; 0% instances), ccomp:pmod (1; 0% instances), clf (1; 0% instances), dep (1; 0% instances), fixed (1; 0% instances), list (1; 0% instances)
Children of NOUN nodes belong to 16 different parts of speech: ADP (41272; 28% instances), NOUN (21928; 15% instances), PUNCT (18538; 13% instances), DET (16268; 11% instances), ADJ (8015; 5% instances), CCONJ (7241; 5% instances), PRON (6613; 5% instances), VERB (6325; 4% instances), PROPN (6081; 4% instances), ADV (4729; 3% instances), AUX (4310; 3% instances), NUM (3582; 2% instances), SCONJ (977; 1% instances), PART (343; 0% instances), INTJ (101; 0% instances), X (5; 0% instances)