Statistics of NOUN in UD

This page pertains to UD version 2.

Treebank Statistics: UD_Tupinamba-TuDeT: POS Tags: `NOUN`

There are 554 NOUN lemmas (45%), 954 NOUN types (49%) and 1436 NOUN tokens (32%). Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.

The 10 most frequent NOUN lemmas: _, iko, eko, aβa, jar, maʔe, apɨaβ, tupã, so, sɨ

The 10 most frequent NOUN types: aβa, janejara, maʔe, taβa, paʔi, Tupã, apɨaβa, cruz, seko, teko

The 10 most frequent ambiguous lemmas: _ (NOUN 87, VERB 34, PUNCT 12, ADP 9, PRON 9, PROPN 9, PART 6, ADV 5, NUM 2, DET 1, X 1), iko (NOUN 43, VERB 27, DET 5, ADV 1), aβa (NOUN 37, PRON 7), jar (NOUN 28, VERB 5, ADV 1), maʔe (NOUN 22, PRON 7, INTJ 1), so (VERB 32, NOUN 16), mojaŋ (NOUN 13, VERB 4), awsuβ (VERB 13, NOUN 12), ereko (NOUN 12, VERB 4), poʃɨ (NOUN 12, VERB 1)

The 10 most frequent ambiguous types: aβa (NOUN 20, PRON 2), maʔe (NOUN 13, PROPN 1), Tupã (PROPN 35, NOUN 12), kujã (NOUN 9, PART 1), São (NOUN 6, PROPN 1, PUNCT 1), marã (ADV 7, NOUN 5, PRON 2), Jesu (NOUN 2, PROPN 1), kotɨ (ADP 3, NOUN 2), pe (NOUN 2, PART 1), βeβe (NOUN 2, VERB 1)

aβa
- NOUN 20: kujãaŋaturama aβa βɨkawereʔɨma SantaMaria serɨβaʔe memɨ́ramo sekoreme
- PRON 2: Oso βepe amõ aβa aʔepeno ?
maʔe
- NOUN 13: maʔe ɨpɨruŋa jaβiʔõ , kwepe marãteko omoʔaŋekoaime
- PROPN 1: Ejori , maʔe nem , maʔe poʃɨ , mora , miaratakaka , seβoʔi , tamarutaka !
Tupã
- PROPN 35: Tupã momeʔu sɨkɨijeʔɨma , sauwsuβa rese seʔõuw
- NOUN 12: maʔepe imoŋaraiβɨpɨra jeroβjasaβete Tupã mojɨrõ potasaβamo ?
kujã
- NOUN 9: Oso kujã semimoʔeeta sapirõmo
- PART 1: Kotɨpe muru amoiŋe , kujã ajukaʔĩ sekoape .
São
- NOUN 6: Karaiβa nasetaj , São Sebastião aʔe omonɨk tata sese , imonɨja
- PROPN 1: Aikoβe nise sarõana , São Seβastião irũ , São Lourenço pɨtɨβõana .
- PUNCT 1: Aikoβe nise sarõana , São Seβastião irũ , São Lourenço pɨtɨβõana .
marã
- ADV 7: Aʔepe marã apɨ́aβeteramo sekow jane jaβe ?
- NOUN 5: Arakajate omorɨ ; ojojá marã sekow .
- PRON 2: Taʃepɨsɨrõ Tupã ʃesumarã swi , kwepe marã ʃerekoape , ojaβo
Jesu
- NOUN 2: Peporeawsu korine ; pemoɨrõ paʔi Jesu , ko taβa poβupoβu . perapɨ tataenɨne !
- PROPN 1: Pesawsu pemojaŋara , peimoete paʔi Jesu .
kotɨ
- ADP 3: Mokõj mona iʔekatwaβa kotɨ amõ , ae amõ iasu kotɨ
- NOUN 2: Aŋari , ajosuβ aβa kotɨ , taʃereroβjar wiʔiaβo . 0u tejẽ ʃepeʔaβo aβare ʔiaβa , kori , Tupã reko momewaβo .
pe
- NOUN 2: Ojtɨk , judeuseta cruz roβaβo pe rupi owataβaʔe aβe .
- PART 1: Marã ojkoβo pe teko aŋaipaβa ʔoki ?
βeβe
- NOUN 2: Karai βeβe serã , ko taβa rarõanete
- VERB 1: Pejaβiʔõ paʔi Tupã karai βeβe moiŋow .
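The ambiguous-type list above can be reproduced in principle by grouping word forms and counting the tags seen with each. The following is a minimal sketch, not the script that generated this page. The function name `ambiguous_types` and the toy input are hypothetical: the counts for aβa mirror the figures reported above, while the count for cruz is invented for illustration.

```python
from collections import Counter, defaultdict

def ambiguous_types(tagged_tokens):
    """Return {form: Counter(tag -> count)} for forms seen with more than one tag."""
    tags_by_form = defaultdict(Counter)
    for form, upos in tagged_tokens:
        tags_by_form[form][upos] += 1
    # A type is ambiguous when it occurs with at least two different tags.
    return {form: tags for form, tags in tags_by_form.items() if len(tags) > 1}

# Hypothetical toy input: (word form, UPOS tag) pairs.
sample = [("aβa", "NOUN")] * 20 + [("aβa", "PRON")] * 2 + [("cruz", "NOUN")] * 3

result = ambiguous_types(sample)
for form, tags in result.items():
    print(form, tags.most_common())
```

Grouping by lemma instead of by word form would give the ambiguous-lemma list in the same way.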

Morphology

The form / lemma ratio of NOUN is 1.722022 (the average of all parts of speech is 1.577170).
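The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given in the summary at the top of the page. A small check, assuming that this is how the figure is defined (the variable names are illustrative):

```python
# Counts taken from the summary at the top of this page.
noun_types = 954    # distinct NOUN word forms
noun_lemmas = 554   # distinct NOUN lemmas

# Assumption: form / lemma ratio = types / lemmas.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")
```

A ratio above 1 means that, on average, each NOUN lemma appears in more than one distinct form.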

The 1st highest number of forms (82) was observed with the lemma “_”: Nasawsuβarɨpɨramo, Naʃeremimotara, Perenosema, Takwara, apɨaβ, apɨaβaiβa, apɨaβaíβa, atɨraβeβo, aíβa, culpa, iatõjmɨreʔɨma, ijaʔo, ijukasarama, ikajemi, ikawĩwasuβaʔe, imoerapwanɨmɨra, imojarɨpɨ́ramo, imoperepereβawera, imoreʔɨmara, inupãsawera, ipira, ipotasape, ipoʃɨpwera, jemeʔeŋa, katupaβẽ, kɨreʔɨmβawera, manemwera, maramojaŋape, maraneʔɨma, maʔe, maʔeaiβa, mojaŋawama, monɨaβo, neratãŋatu, nererupa, nesawsuβa, nijɨpɨj, oarõanamo, oaʔo, oguβa, omara, omaramojãβaʔepwera, omoʔaŋekoaime, oreamotareʔɨmara, orewasemaβa, oreɨβɨjme, peposaŋa, pepɨsɨrõ, pepɨsɨrõawama, perekomojaŋaβa, perekorama, perekoreme, peremimojaŋa, peremimojaŋwama, pererekoaíme, pesemawama, porarasara, poreteramo, pɨtɨβõsara, rekoreme, rɨrɨja, sejtɨkite, sekow, soβajɨwaramo, sɨrɨki, tamanwa, tuisamojaŋape, tɨmawereʔɨma, upiarwera, uʔuβorwera, ɨsɨkatãsɨapwana, ɨβɨtu, ʃejeʔeŋa, ʃemomwerapane, ʃepapera, ʃepo, ʃerejtɨk, ʃereminuʔune, ʃererekow, ʃerorɨkatu, ʃerorɨβetene, ʃeruβisaβa.

The 2nd highest number of forms (19) was observed with the lemma “iko”: Ejmoiŋokatu, Sekote, janereko, moiŋow, nereko, oeko, ojkoβaʔe, pereko, perekopwera, perekoreme, reko, rekow, seko, sekoreme, sekow, serekow, teko, tekwara, ʃerekoape.

The 3rd highest number of forms (16) was observed with the lemma “eko”: Oeko, nereko, owekorama, pereko, reko, rekow, seko, sekopwera, sekoreme, sekow, serekow, teko, tekoara, tekopwera, ʃereko, ʃerekoape.

NOUN occurs with 33 features: Case (526; 37% instances), Rel (497; 35% instances), Number (209; 15% instances), Person (170; 12% instances), Nomzr (133; 9% instances), Person[psor] (120; 8% instances), NonFoc (92; 6% instances), Tense (80; 6% instances), Number[psor] (77; 5% instances), Reflex (61; 4% instances), Intens (49; 3% instances), Clusivity (43; 3% instances), Voice (41; 3% instances), Polarity (33; 2% instances), VerbForm (29; 2% instances), Mood (23; 2% instances), Int (19; 1% instances), Corf (18; 1% instances), Aspect (11; 1% instances), Degree (11; 1% instances), Priv (9; 1% instances), Hum (8; 1% instances), Red (6; 0% instances), Person[subj] (5; 0% instances), Dev (3; 0% instances), Emph (2; 0% instances), Number[subj] (2; 0% instances), Person[obj] (2; 0% instances), Animacy (1; 0% instances), Delib (1; 0% instances), Foreign (1; 0% instances), Poss (1; 0% instances), PronType (1; 0% instances)

NOUN occurs with 63 feature-value pairs: Animacy=Hum, Aspect=Iter, Aspect=Lus, Case=All, Case=Dat, Case=Loc, Case=Per, Case=Ref, Case=Tra, Case=Voc, Clusivity=Ex, Clusivity=In, Corf=Yes, Degree=Aug, Delib=Yes, Dev=Pass, Emph=Yes, Foreign=Yes, Hum=Yes, Int=Yes, Intens=Yes, Mood=Cnd, Mood=Irr, Mood=Per, Mood=Sub, Nomzr=Ag, Nomzr=CCirc, Nomzr=Circ, Nomzr=DevPass, Nomzr=Hab, Nomzr=Pas, Nomzr=Rel, NonFoc=Yes, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Sing, Person=1, Person=2, Person=3, Person[obj]=3, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Polarity=Neg, Poss=Hum, Priv=Yes, PronType=Rcp, Red=Di, Reflex=Yes, Rel=Abs, Rel=Cont, Rel=Corf, Rel=Hum, Rel=NCont, Tense=Fut, Tense=Past, VerbForm=Ger, Voice=Cau, Voice=Mid, Voice=SCau
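Feature and feature-value counts of this kind can be derived from the FEATS column of a CoNLL-U file, where pairs are joined with `|` and an empty feature set is written `_`. The sketch below assumes that format. The helper names `parse_feats` and `feature_stats` and the sample strings are hypothetical, and multi-valued features (comma-separated values) are not split.

```python
from collections import Counter

def parse_feats(feats):
    """Turn 'Case=Loc|Number=Sing' into {'Case': 'Loc', 'Number': 'Sing'}."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def feature_stats(feats_column):
    """Count how many tokens carry each feature and each feature-value pair."""
    feature_counts = Counter()
    pair_counts = Counter()
    for feats in feats_column:
        for name, value in parse_feats(feats).items():
            feature_counts[name] += 1
            pair_counts[f"{name}={value}"] += 1
    return feature_counts, pair_counts

# Hypothetical FEATS values for four NOUN tokens.
sample_feats = ["Case=Loc|Number=Sing", "Case=Voc", "_", "Person[psor]=3|Rel=NCont"]
feature_counts, pair_counts = feature_stats(sample_feats)
print(feature_counts.most_common())

# Share of NOUN tokens with Case, using the counts reported on this page.
case_share = round(100 * 526 / 1436)
print(case_share)
```

The percentages on this page appear to be each count divided by the 1436 NOUN tokens and rounded to the nearest integer, which is why low counts show as 0%.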

NOUN occurs with 324 feature combinations. The most frequent feature combination is _ (372 tokens). Examples: aβa, maʔe, paʔi, Tupã, cruz, judeus, kujã, muru, São, kawĩ

Relations

NOUN nodes are attached to their parents using 21 different relations: obl (279; 19% instances), nmod (253; 18% instances), root (223; 16% instances), obj (210; 15% instances), nsubj (111; 8% instances), conj (78; 5% instances), parataxis (73; 5% instances), appos (63; 4% instances), advcl (53; 4% instances), dep (25; 2% instances), xcomp (15; 1% instances), ccomp (12; 1% instances), acl (10; 1% instances), compound (10; 1% instances), vocative (7; 0% instances), discourse (5; 0% instances), nummod (3; 0% instances), amod (2; 0% instances), case (2; 0% instances), dislocated (1; 0% instances), iobj (1; 0% instances)

Parents of NOUN nodes belong to 11 different parts of speech (counting the empty tag of parentless root nodes as one): NOUN (625; 44% instances), VERB (508; 35% instances), [no parent: root] (223; 16% instances), PROPN (55; 4% instances), PRON (12; 1% instances), ADP (4; 0% instances), ADV (4; 0% instances), NUM (2; 0% instances), DET (1; 0% instances), INTJ (1; 0% instances), PART (1; 0% instances)

575 (40%) NOUN nodes are leaves.

397 (28%) NOUN nodes have one child.

199 (14%) NOUN nodes have two children.

265 (18%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 12.
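The child-degree figures above follow from the HEAD column of the treebank: a node's degree is the number of tokens whose head is that node. A minimal sketch, assuming each sentence is available as a list of `(id, upos, head)` tuples (this representation, the function name `noun_child_degrees`, and the toy sentence are hypothetical):

```python
from collections import Counter

def noun_child_degrees(sentence):
    """Return the number of children of every NOUN node in one sentence.

    sentence: list of (id, upos, head) tuples; head 0 marks the root.
    """
    children = Counter(head for _, _, head in sentence)
    return [children[tok_id] for tok_id, upos, _ in sentence if upos == "NOUN"]

# Hypothetical 4-token sentence: token 2 is the root VERB.
sentence = [(1, "NOUN", 2), (2, "VERB", 0), (3, "NOUN", 2), (4, "ADP", 3)]
degrees = noun_child_degrees(sentence)
leaves = sum(1 for d in degrees if d == 0)
print(degrees, leaves)

# Sanity check on the figures reported above: the four buckets cover all NOUN tokens.
buckets = {"leaf": 575, "one child": 397, "two children": 199, "three or more": 265}
total = sum(buckets.values())
print(total)
```

The four reported buckets sum to 1436, the total number of NOUN tokens, so every NOUN node falls into exactly one of them.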

Children of NOUN nodes are attached using 24 different relations: punct (432; 23% instances), nmod (328; 18% instances), case (188; 10% instances), obl (157; 8% instances), nsubj (122; 7% instances), advmod (119; 6% instances), advcl (88; 5% instances), discourse (78; 4% instances), conj (67; 4% instances), parataxis (67; 4% instances), appos (52; 3% instances), obj (42; 2% instances), dep (33; 2% instances), det (22; 1% instances), compound (17; 1% instances), nummod (13; 1% instances), xcomp (12; 1% instances), cc (8; 0% instances), acl (6; 0% instances), mark (4; 0% instances), vocative (4; 0% instances), dislocated (3; 0% instances), amod (2; 0% instances), ccomp (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: NOUN (625; 34% instances), PUNCT (432; 23% instances), ADP (207; 11% instances), ADV (138; 7% instances), PRON (115; 6% instances), VERB (101; 5% instances), PART (85; 5% instances), PROPN (76; 4% instances), DET (59; 3% instances), NUM (10; 1% instances), INTJ (9; 0% instances), CCONJ (4; 0% instances), SCONJ (3; 0% instances), X (1; 0% instances)
