
This page pertains to UD version 2.

Treebank Statistics: UD_Portuguese-GSD: POS Tags: X

There are 24 X lemmas (0%), 160 X types (0%) and 403 X tokens (0%). Out of 16 observed tags, the rank of X is: 12 in number of lemmas, 8 in number of types and 16 in number of tokens.
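As a rough illustration of how counts like these can be derived, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens for one UPOS tag in CoNLL-U-style rows. The function name `pos_counts` and the toy sentence are invented for this example. It is not the script that generated this page, and it assumes types are compared case-sensitively.

```python
def pos_counts(conllu_text, upos):
    """Count distinct lemmas, distinct forms (types) and tokens tagged `upos`."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented toy sentence (not taken from the treebank).
SAMPLE_ROWS = [
    ["1", "Gosto", "gostar", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "disso", "_", "X", "_", "_", "1", "nmod", "_", "_"],
    ["3", "e", "e", "CCONJ", "_", "_", "4", "cc", "_", "_"],
    ["4", "dele", "dele", "X", "_", "_", "2", "conj", "_", "_"],
    ["5", "etc", "_", "X", "_", "_", "2", "conj", "_", "_"],
    ["6", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]
SAMPLE = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

x_lemmas, x_types, x_tokens = pos_counts(SAMPLE, "X")
```

In the toy sentence, two X tokens share the lemma `_`, so the number of lemmas is smaller than the number of types, which mirrors the pattern in the figures above.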

The 10 most frequent X lemmas: _, dele, Y, art, avant, best, center, di, food, free

The 10 most frequent X types: disso, deles, delas, dele, do, +, etc, @, comigo, nele

The 10 most frequent ambiguous lemmas: _ (PROPN 26803, ADP 7821, PRON 6131, DET 3765, NOUN 3010, NUM 2377, AUX 1984, CCONJ 1516, PUNCT 1272, VERB 1077, SYM 904, ADJ 597, PART 561, X 379, ADV 191, SCONJ 3), art (NOUN 1, X 1), di (PROPN 1, X 1), of (PROPN 15, ADP 2, X 1), off (NOUN 1, X 1), spin (NOUN 1, X 1)

The 10 most frequent ambiguous types: disso (X 58, ADP 2), dele (X 16, PRON 2), + (X 10, PUNCT 8, PROPN 2), etc (X 10, ADV 1), @ (X 9, PROPN 1, PUNCT 1), comigo (X 9, NOUN 1), no (X 7, PRON 2), pelo (X 7, NOUN 1), desse (ADP 30, X 3, VERB 1), desses (ADP 15, X 4)

Morphology

The form / lemma ratio of X is 6.666667 (the average of all parts of speech is 2.236183).
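The reported ratio agrees with dividing the number of X types (160) by the number of X lemmas (24) given earlier on this page. A minimal check, assuming that this is how the ratio is defined; the helper name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas


# Figures for X as reported on this page.
ratio = form_lemma_ratio(160, 24)
formatted = f"{ratio:.6f}"  # six decimal places, as on this page
```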

The 1st highest number of forms (139) was observed with the lemma “_”: #, &, +, @, Amazon.com, Destes, Flyscoot.com, GameSpot.com, Neles, OBS., T, UltimoInstante, a, a_0, a_1, a_2, a_3, a_i, a_n, amaralcarvalho.org.br, ao, aos, art, atributos.Alucard, b, cdots, comigo, conosco, consigo, contato@cinedireitoshumanos.org.br, contigo, cpae@unesc.net, d, da, daquelas, daquele, daqueles, das, dela, delas, dele, deles, denunciapropaganda@tre-rj.jus.br, dessa, dessas, desse, desses, desta, destas, deste, disso, disto, do, dos, durvalorlato, e, eletrônicowww.cespe.unb.br/concursos/pc_al_12, etc, ex, fake, g1.globo.com/economia, g1.globo.com/ma, g1.globo.com/para, g1.globo.com/piaui, g1.globo.com/politica, g1.globo.com/ribeirao, g1.globo.com/vanguarda, gmail.com, http://m.goal.com, http://t.co/HmrlNAqd, http://www.cmgww.com/stars/baker/about/biography.html, http://www.portal-gestao.com/financas/folhas-de-calculo.html, i, k, m, n, na, naquilo, nela, nelas, nele, nesta, neste, nisso, no, num, o, offs, ouvidoria@imepi.pi.gov.br, p, p.e., pelo, planeta1@sercomtel.com.br, play, poupatemposp, prev, que, r., simone.bavaroski, sum_, up, usopera.com, v1, v2, vm, www.anac.gov.br., www.barracaodosamba.com, www.centropaulasouza.sp.gov.br, www.cotec.unimontes.br, www.detran.rj.gov.br, www.edraaeronautica.com.br, www.goobec.com.br, www.informalcool.org.br, www.ingresso.com, www.ipem.rj.gov.br., www.planetaeducacao.com.br, www.planexcon.com.br, www.receita.fazenda.gov.br, www.saocaetanodosul.sp.gov.br, www.submarino.com.br, www.timedoemprego.sp.gov.br, www.universa.org.br, www.valeviagemcvc.com.br, www.vestibulinhoetec.com.br, x, à, àquela, àqueles, às, λ1, λ2, λm, الاذكار, مشرق, ☎, 天台, 日, 禅, 莲.

The 2nd highest number of forms (1) was observed with the lemma “Y”: \epsilon=\epsilon_{0}.

The 3rd highest number of forms (1) was observed with the lemma “art”: art.

X occurs with 3 features: Gender (6; 1% instances), Number (6; 1% instances), ExtPos (4; 1% instances)

X occurs with 4 feature-value pairs: ExtPos=NOUN, Gender=Fem, Gender=Masc, Number=Sing

X occurs with 5 feature combinations. The most frequent feature combination is _ (394 tokens). Examples: disso, deles, delas, dele, do, +, etc, @, comigo, nele

Relations

X nodes are attached to their parents using 19 different relations: nmod (230; 57% instances), appos (35; 9% instances), conj (27; 7% instances), fixed (25; 6% instances), flat (23; 6% instances), case (10; 2% instances), parataxis (9; 2% instances), dep (8; 2% instances), flat:foreign (8; 2% instances), cc (6; 1% instances), obj (4; 1% instances), root (4; 1% instances), amod (3; 1% instances), nsubj (3; 1% instances), iobj (2; 0% instances), mark (2; 0% instances), obl (2; 0% instances), ccomp (1; 0% instances), nsubj:pass (1; 0% instances)

Parents of X nodes belong to 11 different parts of speech: NOUN (113; 28% instances), VERB (101; 25% instances), ADV (59; 15% instances), X (41; 10% instances), PRON (33; 8% instances), ADJ (20; 5% instances), PROPN (18; 4% instances), NUM (7; 2% instances), ADP (4; 1% instances), [no parent: root] (4; 1% instances), SYM (3; 1% instances)
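The two breakdowns above (incoming relation and part of speech of the parent) can be sketched for a single sentence as follows. This is an illustrative sketch only: the function name `attachment_stats`, the `(root)` placeholder label and the toy sentence are assumptions made for the example, not part of the tooling behind this page.

```python
from collections import Counter


def attachment_stats(sentence_rows, upos):
    """Return (deprel counts, parent-UPOS counts) for tokens tagged `upos`.

    `sentence_rows` is one sentence given as a list of 10-column CoNLL-U rows.
    """
    # Map token ID -> UPOS so that HEAD values can be resolved to a parent tag.
    upos_by_id = {row[0]: row[3] for row in sentence_rows}
    deprels, parents = Counter(), Counter()
    for row in sentence_rows:
        if row[3] != upos:
            continue
        deprels[row[7]] += 1  # DEPREL column
        # HEAD "0" marks the root, which has no parent token.
        parents[upos_by_id.get(row[6], "(root)")] += 1
    return deprels, parents


# Invented toy sentence (not taken from the treebank).
TOY_SENTENCE = [
    ["1", "Gosto", "gostar", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "disso", "_", "X", "_", "_", "1", "nmod", "_", "_"],
    ["3", "e", "e", "CCONJ", "_", "_", "4", "cc", "_", "_"],
    ["4", "dele", "dele", "X", "_", "_", "2", "conj", "_", "_"],
    ["5", "etc", "_", "X", "_", "_", "2", "conj", "_", "_"],
    ["6", ".", ".", "PUNCT", "_", "_", "1", "punct", "_", "_"],
]

x_deprels, x_parents = attachment_stats(TOY_SENTENCE, "X")
```

A token whose HEAD is 0 has no parent token, so it is counted under the placeholder label rather than under a real part of speech.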

253 (63%) X nodes are leaves.

93 (23%) X nodes have one child.

30 (7%) X nodes have two children.

27 (7%) X nodes have three or more children.

The highest child degree of an X node is 38.
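The child-degree figures above can be cross-checked against the 403 X tokens reported at the top of the page. A small sketch, assuming the percentages are rounded to the nearest whole number; the dictionary keys are labels chosen for this example:

```python
# Counts of X nodes by number of children, as reported on this page.
degree_counts = {
    "leaves": 253,
    "one child": 93,
    "two children": 30,
    "three or more children": 27,
}

# The four groups should partition all X tokens.
total = sum(degree_counts.values())

# Share of each group, rounded to whole percent.
percentages = {
    label: round(100 * count / total)
    for label, count in degree_counts.items()
}
```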

Children of X nodes are attached using 22 different relations: punct (91; 32% instances), acl:relcl (46; 16% instances), flat (30; 11% instances), case (26; 9% instances), nmod (15; 5% instances), conj (14; 5% instances), det (14; 5% instances), flat:foreign (8; 3% instances), cc (7; 2% instances), appos (5; 2% instances), acl (4; 1% instances), amod (4; 1% instances), advmod (3; 1% instances), cop (3; 1% instances), nsubj (3; 1% instances), nummod (3; 1% instances), dep (1; 0% instances), det:poss (1; 0% instances), flat:name (1; 0% instances), mark (1; 0% instances), parataxis (1; 0% instances), xcomp (1; 0% instances)

Children of X nodes belong to 12 different parts of speech: PUNCT (91; 32% instances), VERB (52; 18% instances), X (41; 15% instances), ADP (24; 9% instances), NOUN (20; 7% instances), DET (16; 6% instances), PROPN (13; 5% instances), CCONJ (8; 3% instances), NUM (6; 2% instances), ADJ (4; 1% instances), ADV (4; 1% instances), AUX (3; 1% instances)