
This page pertains to UD version 2.

Treebank Statistics: UD_Portuguese-PetroGold: POS Tags: X

There are 102 X lemmas (1%), 115 X types (1%) and 216 X tokens (0%). Out of 16 observed tags, the rank of X is: 7 in number of lemmas, 7 in number of types and 15 in number of tokens.
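
As a rough illustration of how the three counts differ, the sketch below tallies distinct lemmas, distinct word forms (types) and total tokens for one tag in CoNLL-U text. The function name `upos_counts` and the toy rows are hypothetical and not taken from the treebank. It also assumes case-sensitive counting, which may not match how this page's figures were produced.

```python
def upos_counts(conllu_text, upos="X"):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular word lines: 10 columns and a plain integer ID.
        # This skips comments, blank lines, multiword ranges (1-2)
        # and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:      # UPOS column
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented toy rows for illustration only (not real treebank data).
rows = [
    ["1", "teste", "teste", "NOUN", "_", "_", "0", "root", "_", "_"],
    ["2", "in", "in", "X", "_", "Foreign=Yes", "1", "nmod", "_", "_"],
    ["3", "situ", "situ", "X", "_", "Foreign=Yes", "2", "flat:foreign", "_", "_"],
    ["4", "in", "in", "X", "_", "Foreign=Yes", "1", "nmod", "_", "_"],
]
sample = "\n".join("\t".join(r) for r in rows)
print(upos_counts(sample))  # (2, 2, 3)
```

In the toy sample, "in" occurs twice, so it adds two tokens but only one type and one lemma.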

The 10 most frequent X lemmas: _, in, drill, n, flow, core, booster, pin, situ, stripe

The 10 most frequent X types: in, drill, n, flow, core, ., booster, pin, situ, stripe

The 10 most frequent ambiguous lemmas: in (X 13, PROPN 1), n (NOUN 16, X 10), flow (X 9, NOUN 3, PROPN 1), core (X 6, NOUN 3), . (PUNCT 7995, PROPN 34, X 3), al.( (NOUN 3, X 2), and (PROPN 11, X 2), com.br (NOUN 3, X 2), playa (NOUN 10, X 2), óleo (NOUN 1094, X 2)

The 10 most frequent ambiguous types: in (X 13, PROPN 1), n (NOUN 13, X 10), flow (X 9, NOUN 3, PROPN 1), core (X 6, NOUN 3), . (PUNCT 7995, PROPN 34, X 5), / (PUNCT 39, ADP 11, PROPN 5, X 2), al.( (NOUN 3, X 2), and (PROPN 11, X 2), cima (NOUN 6, X 2), com.br (NOUN 3, X 2)

Morphology

The form / lemma ratio of X is 1.127451 (the average of all parts of speech is 1.452143).
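
The ratio reported above is consistent with dividing the number of X types by the number of X lemmas given earlier on this page. That this is how the figure was derived is an assumption based on the numbers matching. A quick check:

```python
# Counts taken from the summary at the top of this page.
x_types = 115   # distinct X word forms
x_lemmas = 102  # distinct X lemmas

ratio = x_types / x_lemmas
print(f"{ratio:.6f}")  # 1.127451
```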

The highest number of forms (15) was observed with the lemma “_”: ., /, cima, com.br/pt/, eará, escala, estrategia/plano-de-negocios-e-gestao, eórico, org, quem-somos, químicas, wikimedia, ão, ões, ’s.

The second highest number of forms (1) was observed with the lemma “.”: ..

The third highest number of forms (1) was observed with the lemma “adsorption”: adsorption.

X occurs with 3 features: Foreign (150; 69% instances), Gender (1; 0% instances), Number (1; 0% instances)

X occurs with 3 feature-value pairs: Foreign=Yes, Gender=Masc, Number=Plur

X occurs with 3 feature combinations. The most frequent feature combination is Foreign=Yes (150 tokens). Examples: drill, n, in, flow, booster, situ, core, station, balling, bit

Relations

X nodes are attached to their parents using 15 different relations: flat:foreign (82; 38% instances), nmod (73; 34% instances), appos (18; 8% instances), flat (15; 7% instances), goeswith (9; 4% instances), flat:name (4; 2% instances), conj (3; 1% instances), parataxis (3; 1% instances), amod (2; 1% instances), obj (2; 1% instances), nsubj (1; 0% instances), nsubj:pass (1; 0% instances), obl (1; 0% instances), obl:agent (1; 0% instances), obl:arg (1; 0% instances)

Parents of X nodes belong to 6 different parts of speech: NOUN (102; 47% instances), X (89; 41% instances), PROPN (10; 5% instances), VERB (9; 4% instances), ADJ (4; 2% instances), ADV (2; 1% instances)

138 (64%) X nodes are leaves.

30 (14%) X nodes have one child.

11 (5%) X nodes have two children.

37 (17%) X nodes have three or more children.

The highest child degree of an X node is 6.
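
The child-degree breakdown above can be sketched as follows. This is a minimal illustration: the helper name `degree_buckets`, the `(id, head, upos)` tuple layout and the toy sentence are all hypothetical, and the tree is not taken from the treebank.

```python
from collections import Counter


def degree_buckets(nodes, upos="X"):
    """Bucket nodes of one UPOS tag by number of children.

    nodes: list of (id, head, upos) tuples for a single sentence,
    where head == 0 marks the root.
    """
    # Each node's child count is the number of times its id appears as a head.
    n_children = Counter(head for _, head, _ in nodes)
    buckets = Counter()
    for node_id, _, tag in nodes:
        if tag == upos:
            degree = n_children[node_id]
            buckets["3+" if degree >= 3 else str(degree)] += 1
    return dict(buckets)


# Invented toy tree: node 2 (X) has three children, nodes 3 and 4 (X) are leaves.
toy = [(1, 0, "NOUN"), (2, 1, "X"), (3, 2, "X"), (4, 2, "X"), (5, 2, "PUNCT")]
print(degree_buckets(toy))
```

Here the bucket "0" corresponds to leaves, and "3+" to nodes with three or more children.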

Children of X nodes are attached using 13 different relations: flat:foreign (82; 45% instances), punct (47; 26% instances), case (15; 8% instances), nmod (14; 8% instances), det (6; 3% instances), conj (4; 2% instances), flat (4; 2% instances), amod (3; 2% instances), appos (2; 1% instances), cc (2; 1% instances), acl:relcl (1; 1% instances), advmod (1; 1% instances), nummod (1; 1% instances)

Children of X nodes belong to 10 different parts of speech: X (89; 49% instances), PUNCT (47; 26% instances), ADP (15; 8% instances), NOUN (12; 7% instances), DET (6; 3% instances), PROPN (6; 3% instances), ADJ (3; 2% instances), CCONJ (2; 1% instances), ADV (1; 1% instances), NUM (1; 1% instances)