
This page pertains to UD version 2.

Treebank Statistics: UD_Romanian-RRT: POS Tags: X

There are 74 X lemmas (0%), 115 X types (0%) and 161 X tokens (0%). Out of 16 observed tags, the rank of X is: 7 in number of lemmas, 9 in number of types and 15 in number of tokens.

The 10 most frequent X lemmas: _, 5a, American, alia, in, inter, metri_pătrați, -a, ACTIVE, Awards

The 10 most frequent X types: 000, 500, 100, mp, 0, 2, 5a, American, K., alia

The 10 most frequent ambiguous lemmas: _ (X 82, NUM 2, PUNCT 1), 5a (ADV 3, X 2, NUM 1, PROPN 1), in (ADP 23, NOUN 1, X 1), -a (DET 23, X 1), Awards (PROPN 1, X 1), Book (PROPN 1, X 1), Klebsiella (PROPN 1, X 1), New (PROPN 7, X 1), al (DET 2845, X 1), car (NOUN 3, X 1)

The 10 most frequent ambiguous types: 000 (X 30, NUM 1), 500 (NUM 8, X 4), 100 (NUM 22, X 3), mp (NOUN 4, X 3), 0 (NUM 22, X 2), 2 (NUM 281, X 2), 5a (ADV 3, X 2, NUM 1, PROPN 1), dată (NOUN 76, VERB 6, X 2, ADJ 1), in (ADP 18, NOUN 1, X 1), un (DET 1610, NUM 16, X 2)

Morphology

The form / lemma ratio of X is 1.554054 (the average of all parts of speech is 1.814756).
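The ratio above is consistent with dividing the 115 X types by the 74 X lemmas. A minimal sketch of that calculation follows. It assumes that a "form" is a distinct (lemma, word form) pair and that forms are compared case-sensitively; the function name `form_lemma_ratio` and the toy token list are hypothetical and are not part of the tooling that generated this page.

```python
from collections import defaultdict

def form_lemma_ratio(tokens, upos):
    """Distinct (lemma, form) pairs divided by distinct lemmas for one tag.

    tokens is an iterable of (form, lemma, tag) triples.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            forms_by_lemma[lemma].add(form)
    n_forms = sum(len(forms) for forms in forms_by_lemma.values())
    return n_forms / len(forms_by_lemma)

# Hypothetical toy data, not read from the treebank files.
toy = [("000", "_", "X"), ("500", "_", "X"), ("5a", "5a", "X"), ("un", "un", "DET")]
toy_ratio = form_lemma_ratio(toy, "X")  # 3 forms over 2 lemmas

# The counts reported on this page: 115 types over 74 lemmas.
page_ratio = round(115 / 74, 6)
```

With the page's counts, `page_ratio` reproduces the reported value to six decimal places.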

The highest number of forms (44) was observed with the lemma “_”: -apune, 0, 000, 065, 100, 112, 2, 230, 2C9, 3, 307, 390, 391, 3A4, 400, 463, 500, 672, 720, 736, 770, 867, 898, 9, 900, 914, 957, 996, Dopa, G-CSF, VAMA, alpine, amiezei, dată, dopei, glicozidice, glicozidică, grabă, operativă, retinoizi, spre, un, una, zicochimice.

The 2nd highest number of forms (1) was observed with the lemma “-a”: -a.

The 3rd highest number of forms (1) was observed with the lemma “5a”: 5a.

X occurs with 2 features: Foreign (31; 19% instances), Abbr (9; 6% instances)

X occurs with 2 feature-value pairs: Abbr=Yes, Foreign=Yes

X occurs with 3 feature combinations. The most frequent feature combination is _ (121 tokens). Examples: 000, 500, 100, 0, 2, American, dată, un, -a, -apune

Relations

X nodes are attached to their parents using 13 different relations: goeswith (82; 51% instances), flat (37; 23% instances), nmod (14; 9% instances), conj (8; 5% instances), appos (6; 4% instances), amod (3; 2% instances), dep (3; 2% instances), fixed (2; 1% instances), nsubj (2; 1% instances), case (1; 1% instances), obj (1; 1% instances), obl (1; 1% instances), root (1; 1% instances)
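The percentages in the relation list can be rechecked from the raw counts. The sketch below copies the counts from the line above and assumes that the page rounds each share to the nearest whole percent; that rounding rule is an assumption, and the helper `percent` is a hypothetical name.

```python
# Counts copied from the relation list above.
relation_counts = {
    "goeswith": 82, "flat": 37, "nmod": 14, "conj": 8, "appos": 6,
    "amod": 3, "dep": 3, "fixed": 2, "nsubj": 2,
    "case": 1, "obj": 1, "obl": 1, "root": 1,
}

total = sum(relation_counts.values())  # should equal the number of X tokens

def percent(count, total):
    """Share of total as a whole percent, rounding half up with integer arithmetic."""
    return (200 * count + total) // (2 * total)

shares = {rel: percent(n, total) for rel, n in relation_counts.items()}
```

The counts sum to the 161 X tokens reported earlier, and relations that occur only once still round up to 1% under this rule.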

Parents of X nodes belong to 11 different parts of speech: NUM (63; 39% instances), X (33; 20% instances), NOUN (31; 19% instances), PROPN (13; 8% instances), ADJ (7; 4% instances), ADV (5; 3% instances), VERB (4; 2% instances), DET (2; 1% instances), ADP (1; 1% instances), PRON (1; 1% instances), no part of speech, i.e. the artificial root (1; 1% instances)

124 (77%) X nodes are leaves.

10 (6%) X nodes have one child.

14 (9%) X nodes have two children.

13 (8%) X nodes have three or more children.

The highest child degree of an X node is 6.
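The leaf and child-degree figures above can be derived from the head column of each sentence. The following is a rough sketch under stated assumptions: heads are 1-based token indices with 0 for the root, as in CoNLL-U, and the function name `child_degree_buckets` and the five-token toy sentence are hypothetical.

```python
from collections import Counter

def child_degree_buckets(heads, upos, tag):
    """Bucket the nodes carrying `tag` by their number of children.

    heads[i] is the 1-based head of token i+1 (0 means root);
    upos[i] is the part-of-speech tag of token i+1.
    """
    n_children = Counter(heads)  # how often each index is used as a head
    buckets = Counter()
    max_degree = 0
    for idx, node_tag in enumerate(upos, start=1):
        if node_tag != tag:
            continue
        degree = n_children.get(idx, 0)
        max_degree = max(max_degree, degree)
        if degree == 0:
            buckets["leaf"] += 1
        elif degree == 1:
            buckets["one"] += 1
        elif degree == 2:
            buckets["two"] += 1
        else:
            buckets["three_plus"] += 1
    return buckets, max_degree

# Hypothetical five-token sentence, not taken from the treebank.
toy_heads = [2, 0, 2, 2, 4]
toy_upos = ["X", "X", "PUNCT", "X", "X"]
buckets, max_degree = child_degree_buckets(toy_heads, toy_upos, "X")

# The four buckets reported on this page should cover every X token.
page_total = 124 + 10 + 14 + 13
```

Summing the buckets over all sentences of a treebank would give counts comparable to the four figures above, which together account for all 161 X tokens.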

Children of X nodes are attached using 12 different relations: flat (28; 32% instances), punct (27; 31% instances), case (7; 8% instances), conj (5; 6% instances), nummod (5; 6% instances), det (4; 5% instances), cc (3; 3% instances), amod (2; 2% instances), fixed (2; 2% instances), nmod (2; 2% instances), advmod (1; 1% instances), appos (1; 1% instances)

Children of X nodes belong to 10 different parts of speech: X (33; 38% instances), PUNCT (27; 31% instances), ADP (6; 7% instances), NUM (5; 6% instances), DET (4; 5% instances), PROPN (4; 5% instances), CCONJ (3; 3% instances), ADJ (2; 2% instances), NOUN (2; 2% instances), ADV (1; 1% instances)