Treebank Statistics: UD_Tupinamba-TuDeT: POS Tags: NOUN
There are 554 NOUN lemmas (45%), 954 NOUN types (49%) and 1436 NOUN tokens (32%).
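The three counts above can be reproduced from a CoNLL-U file with a short script. The following is a minimal sketch, not the tool that generated this page: the function name `upos_counts` and the toy sample are hypothetical, the sample annotations are invented for illustration, and types are assumed to be case-sensitive word forms.

```python
def upos_counts(conllu_text, upos="NOUN"):
    """Return (lemmas, types, tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, types = set(), set()
    tokens = 0
    for line in conllu_text.splitlines():
        # Skip blank lines (sentence breaks) and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Keep only ordinary word lines: 10 columns and a plain integer ID.
        # This drops multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column (assumed case-sensitive)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Toy sample; the annotations are invented and not taken from the treebank.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\taβa\taβa\tNOUN\t_\t_\t0\troot\t_\t_",
    "2\tseko\teko\tNOUN\t_\t_\t1\tnmod\t_\t_",
    "3\tteko\teko\tNOUN\t_\t_\t1\tconj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_",
])

print(upos_counts(SAMPLE))  # (2, 3, 3)
```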
Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
The 10 most frequent NOUN lemmas: _, iko, eko, aβa, jar, maʔe, apɨaβ, tupã, so, sɨ
The 10 most frequent NOUN types: aβa, janejara, maʔe, taβa, paʔi, Tupã, apɨaβa, cruz, seko, teko
The 10 most frequent ambiguous lemmas: _ (NOUN 87, VERB 34, PUNCT 12, ADP 9, PRON 9, PROPN 9, PART 6, ADV 5, NUM 2, DET 1, X 1), iko (NOUN 43, VERB 27, DET 5, ADV 1), aβa (NOUN 37, PRON 7), jar (NOUN 28, VERB 5, ADV 1), maʔe (NOUN 22, PRON 7, INTJ 1), so (VERB 32, NOUN 16), mojaŋ (NOUN 13, VERB 4), awsuβ (VERB 13, NOUN 12), ereko (NOUN 12, VERB 4), poʃɨ (NOUN 12, VERB 1)
The 10 most frequent ambiguous types: aβa (NOUN 20, PRON 2), maʔe (NOUN 13, PROPN 1), Tupã (PROPN 35, NOUN 12), kujã (NOUN 9, PART 1), São (NOUN 6, PROPN 1, PUNCT 1), marã (ADV 7, NOUN 5, PRON 2), Jesu (NOUN 2, PROPN 1), kotɨ (ADP 3, NOUN 2), pe (NOUN 2, PART 1), βeβe (NOUN 2, VERB 1)
Morphology
The form / lemma ratio of NOUN is 1.722022 (the average of all parts of speech is 1.577170).
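The reported ratio is consistent with dividing the number of NOUN types by the number of NOUN lemmas given at the top of this page. The check below is a small sketch under that assumption; the variable names are illustrative.

```python
noun_types = 954   # distinct NOUN word forms, from the summary above
noun_lemmas = 554  # distinct NOUN lemmas, from the summary above

# Assumed definition: forms (types) per lemma.
ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 1.722022
```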
The 1st highest number of forms (82) was observed with the lemma “_”: Nasawsuβarɨpɨramo, Naʃeremimotara, Perenosema, Takwara, apɨaβ, apɨaβaiβa, apɨaβaíβa, atɨraβeβo, aíβa, culpa, iatõjmɨreʔɨma, ijaʔo, ijukasarama, ikajemi, ikawĩwasuβaʔe, imoerapwanɨmɨra, imojarɨpɨ́ramo, imoperepereβawera, imoreʔɨmara, inupãsawera, ipira, ipotasape, ipoʃɨpwera, jemeʔeŋa, katupaβẽ, kɨreʔɨmβawera, manemwera, maramojaŋape, maraneʔɨma, maʔe, maʔeaiβa, mojaŋawama, monɨaβo, neratãŋatu, nererupa, nesawsuβa, nijɨpɨj, oarõanamo, oaʔo, oguβa, omara, omaramojãβaʔepwera, omoʔaŋekoaime, oreamotareʔɨmara, orewasemaβa, oreɨβɨjme, peposaŋa, pepɨsɨrõ, pepɨsɨrõawama, perekomojaŋaβa, perekorama, perekoreme, peremimojaŋa, peremimojaŋwama, pererekoaíme, pesemawama, porarasara, poreteramo, pɨtɨβõsara, rekoreme, rɨrɨja, sejtɨkite, sekow, soβajɨwaramo, sɨrɨki, tamanwa, tuisamojaŋape, tɨmawereʔɨma, upiarwera, uʔuβorwera, ɨsɨkatãsɨapwana, ɨβɨtu, ʃejeʔeŋa, ʃemomwerapane, ʃepapera, ʃepo, ʃerejtɨk, ʃereminuʔune, ʃererekow, ʃerorɨkatu, ʃerorɨβetene, ʃeruβisaβa.
The 2nd highest number of forms (19) was observed with the lemma “iko”: Ejmoiŋokatu, Sekote, janereko, moiŋow, nereko, oeko, ojkoβaʔe, pereko, perekopwera, perekoreme, reko, rekow, seko, sekoreme, sekow, serekow, teko, tekwara, ʃerekoape.
The 3rd highest number of forms (16) was observed with the lemma “eko”: Oeko, nereko, owekorama, pereko, reko, rekow, seko, sekopwera, sekoreme, sekow, serekow, teko, tekoara, tekopwera, ʃereko, ʃerekoape.
NOUN occurs with 33 features: Case (526; 37% instances), Rel (497; 35% instances), Number (209; 15% instances), Person (170; 12% instances), Nomzr (133; 9% instances), Person[psor] (120; 8% instances), NonFoc (92; 6% instances), Tense (80; 6% instances), Number[psor] (77; 5% instances), Reflex (61; 4% instances), Intens (49; 3% instances), Clusivity (43; 3% instances), Voice (41; 3% instances), Polarity (33; 2% instances), VerbForm (29; 2% instances), Mood (23; 2% instances), Int (19; 1% instances), Corf (18; 1% instances), Aspect (11; 1% instances), Degree (11; 1% instances), Priv (9; 1% instances), Hum (8; 1% instances), Red (6; 0% instances), Person[subj] (5; 0% instances), Dev (3; 0% instances), Emph (2; 0% instances), Number[subj] (2; 0% instances), Person[obj] (2; 0% instances), Animacy (1; 0% instances), Delib (1; 0% instances), Foreign (1; 0% instances), Poss (1; 0% instances), PronType (1; 0% instances)
NOUN occurs with 63 feature-value pairs: Animacy=Hum, Aspect=Iter, Aspect=Lus, Case=All, Case=Dat, Case=Loc, Case=Per, Case=Ref, Case=Tra, Case=Voc, Clusivity=Ex, Clusivity=In, Corf=Yes, Degree=Aug, Delib=Yes, Dev=Pass, Emph=Yes, Foreign=Yes, Hum=Yes, Int=Yes, Intens=Yes, Mood=Cnd, Mood=Irr, Mood=Per, Mood=Sub, Nomzr=Ag, Nomzr=CCirc, Nomzr=Circ, Nomzr=DevPass, Nomzr=Hab, Nomzr=Pas, Nomzr=Rel, NonFoc=Yes, Number=Plur, Number=Sing, Number[psor]=Plur, Number[psor]=Sing, Number[subj]=Sing, Person=1, Person=2, Person=3, Person[obj]=3, Person[psor]=1, Person[psor]=2, Person[psor]=3, Person[subj]=1, Polarity=Neg, Poss=Hum, Priv=Yes, PronType=Rcp, Red=Di, Reflex=Yes, Rel=Abs, Rel=Cont, Rel=Corf, Rel=Hum, Rel=NCont, Tense=Fut, Tense=Past, VerbForm=Ger, Voice=Cau, Voice=Mid, Voice=SCau
NOUN occurs with 324 feature combinations.
The most frequent feature combination is _ (372 tokens).
Examples: aβa, maʔe, paʔi, Tupã, cruz, judeus, kujã, muru, São, kawĩ
Relations
NOUN nodes are attached to their parents using 21 different relations: obl (279; 19% instances), nmod (253; 18% instances), root (223; 16% instances), obj (210; 15% instances), nsubj (111; 8% instances), conj (78; 5% instances), parataxis (73; 5% instances), appos (63; 4% instances), advcl (53; 4% instances), dep (25; 2% instances), xcomp (15; 1% instances), ccomp (12; 1% instances), acl (10; 1% instances), compound (10; 1% instances), vocative (7; 0% instances), discourse (5; 0% instances), nummod (3; 0% instances), amod (2; 0% instances), case (2; 0% instances), dislocated (1; 0% instances), iobj (1; 0% instances)
Parents of NOUN nodes belong to 11 different parts of speech: NOUN (625; 44% instances), VERB (508; 35% instances), (223; 16% instances), PROPN (55; 4% instances), PRON (12; 1% instances), ADP (4; 0% instances), ADV (4; 0% instances), NUM (2; 0% instances), DET (1; 0% instances), INTJ (1; 0% instances), PART (1; 0% instances)
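One entry in the parent list carries no part-of-speech label. Its count (223) equals the number of root relations reported above, which suggests it stands for NOUN nodes that have no parent; this is an inference from the figures, not something the page states. A short sketch of the consistency check, with illustrative variable names:

```python
# Parent part-of-speech counts as listed above; "" is the unlabeled entry.
parent_pos = {
    "NOUN": 625, "VERB": 508, "": 223, "PROPN": 55, "PRON": 12,
    "ADP": 4, "ADV": 4, "NUM": 2, "DET": 1, "INTJ": 1, "PART": 1,
}
root_relations = 223  # count of the root relation for NOUN nodes
noun_tokens = 1436    # total NOUN tokens

# Every NOUN token is counted exactly once across the parent categories.
print(sum(parent_pos.values()) == noun_tokens)  # True
# The unlabeled category has the same size as the set of root-attached nouns.
print(parent_pos[""] == root_relations)         # True
```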
575 (40%) NOUN nodes are leaves.
397 (28%) NOUN nodes have one child.
199 (14%) NOUN nodes have two children.
265 (18%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 12.
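The four child-degree groups partition the NOUN tokens, so their counts should add up to 1436 and the rounded shares should match the percentages listed. A brief sketch of that arithmetic, with illustrative names:

```python
# Number of NOUN nodes by how many children they have, as listed above.
degree_counts = {
    "leaf": 575,
    "one child": 397,
    "two children": 199,
    "three or more": 265,
}

total = sum(degree_counts.values())
# Percentage of all NOUN nodes in each group, rounded to whole numbers.
shares = {name: round(100 * n / total) for name, n in degree_counts.items()}

print(total)   # 1436
print(shares)  # {'leaf': 40, 'one child': 28, 'two children': 14, 'three or more': 18}
```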
Children of NOUN nodes are attached using 24 different relations: punct (432; 23% instances), nmod (328; 18% instances), case (188; 10% instances), obl (157; 8% instances), nsubj (122; 7% instances), advmod (119; 6% instances), advcl (88; 5% instances), discourse (78; 4% instances), conj (67; 4% instances), parataxis (67; 4% instances), appos (52; 3% instances), obj (42; 2% instances), dep (33; 2% instances), det (22; 1% instances), compound (17; 1% instances), nummod (13; 1% instances), xcomp (12; 1% instances), cc (8; 0% instances), acl (6; 0% instances), mark (4; 0% instances), vocative (4; 0% instances), dislocated (3; 0% instances), amod (2; 0% instances), ccomp (1; 0% instances)
Children of NOUN nodes belong to 14 different parts of speech: NOUN (625; 34% instances), PUNCT (432; 23% instances), ADP (207; 11% instances), ADV (138; 7% instances), PRON (115; 6% instances), VERB (101; 5% instances), PART (85; 5% instances), PROPN (76; 4% instances), DET (59; 3% instances), NUM (10; 1% instances), INTJ (9; 0% instances), CCONJ (4; 0% instances), SCONJ (3; 0% instances), X (1; 0% instances)