UD Highland Puebla Nahuatl ITML
Language: Highland Puebla Nahuatl (code: azz)
Family: Uto-Aztecan
This treebank has been part of Universal Dependencies since the UD v2.13 release.
The following people have contributed to making this treebank part of UD: Robert Pugh, Francis Tyers.
Repository: UD_Highland_Puebla_Nahuatl-ITML
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: spoken, grammar-examples, nonfiction
Questions, comments? General annotation questions (either Highland Puebla Nahuatl-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [pughrob (æt) iu • edu]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
UD_Highland_Puebla_Nahuatl-ITML is a collection of texts in the Highland Puebla variety of Nahuatl (ISO 639-3: azz), spoken in 24 municipalities in the state of Puebla, Mexico. The treebank contains spoken monologue and dialogue, scientific texts translated from Spanish, and some miscellaneous grammatical examples from a language course.
…
Acknowledgments
We would like to thank the following for giving permission to use their sentences:
- Jonathan Amith
- Pedro Rivera Gómez
- Patricia Aguilar Romero
- Sociedad Mexicana de Física
We would also like to thank the following for assistance in the annotation and validation process:
- Mitsuya Sasaki
Statistics of UD Highland Puebla Nahuatl ITML
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Animacy[obj] – Aspect – Case – Degree – Gender – Mood – Movement – NounType – Number – Number[obj] – Number[psor] – Number[subj] – Person – Person[obj] – Person[psor] – Person[subj] – PronType – Subcat – Tense – Typo – VerbForm
Relations
acl – acl:relcl – advcl – advmod – amod – appos – aux – case – cc – ccomp – compound – conj – cop – csubj – dep – det – discourse – dislocated – fixed – flat – goeswith – iobj – mark – nmod – nsubj – nummod – obj – obl – orphan – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 1260 sentences, 10018 tokens and 10088 syntactic words.
- This corpus contains 2061 tokens (21%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 59 types of words that contain both letters and punctuation. Examples: n', ki..., i..., s..., n'mati, nik..., 'technohnotsa, k..., kow..., mo..., n..., no..., o..., 'Panketsah, 'Tasohkamatilia, 'kixtia, 'kwechowah, 'nechili, Kwa..., Nox..., Pan..., Tiki..., chik..., cuad., die..., e.., es..., ix..., ke..., kiste..., lima-limón, moch..., nano-robot, nano-robots, nas..., niki..., nim..., nimo..., oh..., pahpata..., seh..., t'nekih, t'tokaytia, t..., tah..., te..., teh..., ti..., tik..., tikch...
- This corpus contains 70 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 45 types of multi-word tokens. Examples: kaltsintan, kuoujijtik, Tonalixkopa, Kalikampa, imaikan, imatampa, kajfentenoj, koyokopa, kuouijtik, ojtenoj, talixko, xolalpan, Atmolonkopa, Atpoliuikopa, Tatampakopa, Xaltepekopa, Xaltipampa, aten, atentenoh, chilartenoj, ehekaixko, inixiujyo, kaltenoj, kaltsinta, kaltsintaj, kanitehwatsin, kijtosneki, kikuasneki, kitokasneki, kuoujtajpa, maajsi, majase, maseualkopa, mawiltihtinemih, miltenoj, mochiujtiuetskej, nikahsikamatisnekia, nikiitaseki, okse, semiuejueyi, talijtik, tamatisneki, tamixochiyoua, tatampa, tepekespaj.
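The counts in this section can be reproduced from a CoNLL-U file with a few lines of code. The following is a minimal sketch, not the script used to generate this page: the function name `count_units` and the toy fragment are assumptions made for illustration, and the fragment is invented rather than taken from the treebank.

```python
# Toy CoNLL-U fragment invented for illustration; it is NOT treebank data.
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1-2\tab\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\ta\t_\tNOUN\t_\t_\t2\tcompound\t_\t_",
    "2\tb\t_\tNOUN\t_\t_\t3\tobl\t_\t_",
    "3\tc\t_\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
    "4\t.\t_\tPUNCT\t_\t_\t3\tpunct\t_\t_",
    "",
])


def count_units(conllu_text):
    """Count sentences, surface tokens, syntactic words, multi-word tokens,
    and tokens marked SpaceAfter=No in a CoNLL-U string."""
    counts = {"sentences": 0, "tokens": 0, "words": 0,
              "mwts": 0, "no_space_after": 0}
    in_sentence = False
    mwt_end = 0  # last word ID covered by the current multi-word token
    for line in conllu_text.splitlines():
        if not line.strip():          # blank line closes a sentence
            in_sentence = False
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        cols = line.split("\t")
        if not in_sentence:
            in_sentence = True
            mwt_end = 0
            counts["sentences"] += 1
        word_id, misc = cols[0], cols[9]
        if "." in word_id:            # empty node, neither token nor word
            continue
        no_space = "SpaceAfter=No" in misc.split("|")
        if "-" in word_id:            # range line such as "1-2"
            mwt_end = int(word_id.split("-")[1])
            counts["tokens"] += 1
            counts["mwts"] += 1
            counts["no_space_after"] += no_space
        else:
            counts["words"] += 1
            if int(word_id) > mwt_end:  # word is its own surface token
                counts["tokens"] += 1
                counts["no_space_after"] += no_space
    return counts


print(count_units(SAMPLE))
```

The figures reported above are consistent with this way of counting: 70 multi-word tokens of 2.00 syntactic words each add exactly 70 words, and 10018 + 70 = 10088.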
Morphology
Tags
- This corpus uses 15 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: PART, SYM
- This corpus contains 33 lemmas tagged as pronouns (PRON): akaj, akin, eso, in, ini, kachi, katin, ke, miak, namehuan, ne, nehua, nejin, nejon, nen, nijin, nijon, nochi, okse, que, qué, se, seki, tehua, tehuan, tei, tein, teisa, teyi, toni, yehua, yehuan, yon
- This corpus contains 23 lemmas tagged as determiners (DET): cada, el, in, kachi, kada, kanachi, katin, la, miak, ne, neje, nejin, nejon, nen, nijin, nijon, nochi, okse, se, sejse, seki, teisa, yon
- Out of the above, 16 lemmas occurred sometimes as PRON and sometimes as DET: in, kachi, katin, miak, ne, nejin, nejon, nen, nijin, nijon, nochi, okse, se, seki, teisa, yon
- This corpus contains 6 lemmas tagged as auxiliaries (AUX): etok, hueli, huetsi, ma, neki, nemi
- Out of the above, 4 lemmas occurred sometimes as AUX and sometimes as VERB: hueli, huetsi, neki, nemi
- There is 1 (de)verbal form:
- Fin
- VERB: mochiua, onkak, kikua, xochiyoua, monamaka, kualtia, kitoka, mochiwa, ixua, monotsa
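The overlap lists in this section (lemmas tagged sometimes PRON and sometimes DET, or sometimes AUX and sometimes VERB) amount to a set intersection over the LEMMA and UPOS columns. Below is a hedged sketch of that computation. The helper names `lemmas_by_upos` and `shared_lemmas` are hypothetical, and the rows are toy one-word sentences built from lemmas listed above, not sentences from the treebank.

```python
from collections import defaultdict

# Toy one-word "sentences" for illustration only. The lemma/tag pairings
# follow the lists on this page, but the rows are not treebank sentences.
SAMPLE = "\n\n".join([
    "1\tin\tin\tDET\t_\t_\t0\troot\t_\t_",
    "1\tin\tin\tPRON\t_\t_\t0\troot\t_\t_",
    "1\tse\tse\tDET\t_\t_\t0\troot\t_\t_",
    "1\tse\tse\tPRON\t_\t_\t0\troot\t_\t_",
    "1\tcada\tcada\tDET\t_\t_\t0\troot\t_\t_",
    "1\takin\takin\tPRON\t_\t_\t0\troot\t_\t_",
]) + "\n"


def lemmas_by_upos(conllu_text):
    """Map each UPOS tag to the set of lemmas that carry it."""
    table = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multi-word token ranges and empty nodes
        table[cols[3]].add(cols[2])
    return table


def shared_lemmas(table, tag_a, tag_b):
    """Lemmas that occur with both tags, in sorted order."""
    return sorted(table[tag_a] & table[tag_b])


table = lemmas_by_upos(SAMPLE)
print(shared_lemmas(table, "PRON", "DET"))
```

Run on the full treebank, the same intersection for `"AUX"` and `"VERB"` would be expected to yield the four lemmas listed above, assuming the statistics were computed over the same release.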
Nominal Features
- Gender
- Fem
- ADJ: Autónoma, Láctea, cancerígenas, cimarrón, electromagnética, fresca
- DET: la
- NOUN: células, microondas, moléculas, nanotecnologías, cimarrón, manila, manteca, radiación, radiografías, Antropología
- PROPN: Maria, niMaria, Amelia, Coral, Cristina, Elisa, Elvira, Emma, Evangelista, Guadalupe
- Fem,Masc
- ADJ: Ambientales, Austral, Espacial, Norte, celulares, corriente, fuerte, nucleares, útiles
- Masc
- ADJ: morado, moradito, fresco, injertado, mismo, contento, moraditos, poco
- NOUN: marzo, abril, compadrito, mayo, agosto, junio, átomos, enero, febrero, rayos
- PROPN: Miguel, Anastacio, Nicolas, Eleuterio, Ruben, Alfredo, Damian, Ernesto, Gustavo, Jesus
- Number
- Plur
- ADJ: tsikitsitsin, cancerígenas, celulares, matsikitsitsin, moraditos, nucleares, wehweitsitsin, útiles
- DET: oksekin, miyakej, okseki
- NOUN: sitalimej, teposmej, átomos, okwilimeh, pilimej, rayos, taltikpakmej, células, chiktejmej, microondas
- PRON: tehwan, oksekin, Okseki, tejuan, Sekimej, miakeh, nejinkej, Yehwa, namehwan, teh
- Sing
- ADJ: morado, moradito, fresco, injertado, mismo, Ambientales, Austral, Autónoma, Espacial, Láctea
- DET: la
- NOUN: marzo, abril, compadrito, mayo, agosto, junio, enero, febrero, septiembre, criollo
- PRON: neh, yehwa, yeh, yejua, tehwatsin, Yehua, ne, nehwa, teh, yej
- PROPN: Felipe, Pedro, Tetela
- Case
- Abs
- NOUN: taman, kajfentaj, kuoujtaj, kuouit, pajti, xiuit, milaj, tomakilit, at, pahti
- PRON: tehwan, tejuan, Yehwa, namehwan
Degree and Polarity
- Degree
- Dim
- ADJ: kwaltsin, tsikitsitsin, kualtsin, tsiktsin, tsikitsin, potoxtsin, matsikitsitsin, tsikilitsin, wehweitsitsin
- ADV: Ijkontsin, ihkontsin
- NOUN: iteyotsin, nanakatsin, ixochiotsin, xiwtsin, atsin, kontsin, tapialtsin, xochitsin, ichantsin, ikowyotsin
- NUM: setsin
- PRON: tehwatsin
Verbal Features
- Aspect
- Imp
- AUX: nekia
- VERB-Fin: nikmatia, Tikihtowaya, kitaliayah, Tikmatiya, ijkaya, itsmolinia, kikixtiliayah, kikuayaj, kikwayah, kikwiah
- Perf
- VERB-Fin: onkak, nikitak, nikitani, oksik, peuak, techpaleuij, tekokoj, Kinamakaj, TEKITIKEJ, kiijkuilojkej
- Prog
- AUX: tiyetoskiyaj
- VERB-Fin: nikixmattok, nikmattok, chijchiujtokej, xochiyojtok, kinetechojtok, nikitstok, tikixmattok, timonohnotstokeh, Ijkatok, chichiujtok
- Mood
- Cnd
- AUX: eski, eskia, tiyetoskiyaj
- VERB-Fin: Niknekiskia, timotajtanijtoskiyaj
- Imp
- VERB-Fin: xtechtapowi, 'nechili, 'technohnotsa, xinechtapowi, xnechili, xnechtapowi
- Ind
- VERB-Fin: mochiua, onkak, kikua, xochiyoua, monamaka, kualtia, kitoka, mochiwa, ixua, monotsa
- Opt
- AUX: yetokan
- VERB-Fin: tikixnextikan, kiixmatikan, kipepechokan, mochiuakan, tikihtokan, tikinixmatikan
- Prp
- VERB-Fin: Chikawayati, kichkuati, kikuitij, kikwitih, motamiti, nanmontiotaki, nimoskalti, oksitiw, onkati, tamiti
- Tense
- Fut
- AUX: uelis, eski
- VERB-Fin: tayis, kisas, onkas, kikuas, kinemilis, tikmatis, timitstahtaniskeh, Nimitspalehuis, chijchiujtoskej, kiajsikamatis
- Past
- VERB-Fin: onkak, nikitak, nikitani, oksik, peuak, techpaleuij, tekokoj, Kinamakaj, TEKITIKEJ, kiijkuilojkej
- Pqp
- VERB-Fin: kipijpixka
- Pres
- AUX: uelij
- VERB-Fin: mochiua, kikua, xochiyoua, monamaka, kualtia, kitoka, mochiwa, ixua, monotsa, kitokaj
Pronouns, Determiners, Quantifiers
- PronType
- Prs
- PRON: neh, yehwa, tehwan, yeh, yejua, teh, tehwatsin, tejuan, Yehua, ne
- Rel
- ADV: kampa, kanpa
- Person
- 1
- PRON: neh, tehwan, tejuan, ne, nehwa, nejua, teh
- 2
- PRON: tehwatsin, teh, namehwan, tejua
- 3
- PRON: yehwa, yeh, yejua, Yehua, yej
- Number[psor]
- Plur
- NOUN: totaltikpak, tojomiuan, totalmanik, Tosemanauak, Totakayo, intokayuan, toatsin, todios, togalaxia, toixtololouan
- Sing
- NOUN: itech, ika, iteyo, itakka, iuan, ichokilo, iteyotsin, ixochio, ixochiyo, itakilo
Other Features
- Animacy[obj]
- Hum
- VERB-Fin: tekokoj, tekwi, tenamiki, Nikochisneki, Nimitspalehuia, kitemaka, tekekeloua, tekixtiliah, tekui, tepahtia
- Nhum
- VERB-Fin: tapepechowah, takua, Kitatemolia, kitachipauis, tachijchiua, takauani, tamatemouaj, tamatis, tameuj, tamimilmeuaj
- Movement
- Ven
- VERB-Fin: Kiwalkwih, ualeua, kiualkui, kwalkwiah, mokuapiki
- NounType
- Relat
- NOUN: itech, ika, tsintan, kopa, iuan, ijtik, tenoj, ixko, pa, tampa
- Number[obj]
- Plur
- VERB-Fin: 'technohnotsa, techpaleuij, xtechtapowi, kiki, kinamaka, kinchijchiua, kininmaka, kininpoxtalilia, kinitas, kinixmatis
- Sing
- VERB-Fin: kikua, monamaka, kitoka, monotsa, kitokaj, kitokaytiaj, nimonotsa, kikwa, kipiya, kiluiaj
- Number[subj]
- Plur
- ADJ: Tsopek, Xohxoxoktik, Yemanik, teiktik, wehwei
- AUX: tiyetoskiyaj, uelij, yetokan
- NOUN: bolitas, kokonemej, xiuit
- PRON: tisekimeh
- VERB-Fin: kitokaj, kitokaytiaj, kiluiaj, kikwah, kijtouaj, kiliah, kikwih, kichiwah, kikuaj, tikiliah
- Sing
- ADJ: sesek, istak, totonik, welik, uelik, kwali, tsikitsitsin, chichiltik, pisiltik, tsopek
- ADV: neli, tiotak
- AUX: eski, yetok, etok, eskia
- NOUN: pahti, nanakat, xiuit, kuouit, pajti, xiwit, chilaj, cimarrón, criollo, etaj
- PROPN: niMaria, niErnesto, niRuben
- VERB-Fin: mochiua, onkak, kikua, xochiyoua, monamaka, kualtia, kitoka, mochiwa, ixua, monotsa
- Person[obj]
- 1
- VERB-Fin: 'technohnotsa, techpaleuij, xtechtapowi, nechtokaytia, techpaleuia, tinechyolmelaw, titechtapowis, xnechtapowi
- 2
- VERB-Fin: timitstahtaniskeh, nimitsilia, nimitstahtania
- 3
- VERB-Fin: kikua, monamaka, kitoka, monotsa, kitokaj, kitokaytiaj, nimonotsa, kikwa, kipiya, kiluiaj
- Person[psor]
- 1
- NOUN: notokay, totaltikpak, tojomiuan, totalmanik, Tosemanauak, Totakayo, notatahwan, toatsin, todios, togalaxia
- 2
- NOUN: motokay, moaltepe
- 3
- NOUN: itech, ika, iteyo, itakka, iuan, ichokilo, iteyotsin, ixochio, ixochiyo, itakilo
- Person[subj]
- 1
- AUX: tiyetoskiyaj
- NOUN: nikayot
- PRON: tisekimeh
- PROPN: niMaria, niErnesto, niRuben
- VERB-Fin: nimonotsa, nikmati, nikixmattok, niwalew, nikita, nikmattok, tikiliah, nikmatia, tikitaj, niwalewa
- 2
- VERB-Fin: tikihtowa, tikixmati, timonotsa, 'technohnotsa, Tikihtowaya, tikixmattok, tikmatis, tiwalewa, xtechtapowi, 'nechili
- 3
- ADJ: sesek, istak, totonik, welik, uelik, kwali, tsikitsitsin, chichiltik, pisiltik, tsopek
- ADV: neli, tiotak
- AUX: eski, yetok, etok, eskia, uelij, yetokan
- NOUN: pahti, xiuit, nanakat, kuouit, pajti, xiwit, bolitas, chilaj, cimarrón, criollo
- VERB-Fin: mochiua, onkak, kikua, xochiyoua, monamaka, kualtia, kitoka, mochiwa, ixua, monotsa
- Subcat
- Intr
- AUX: uelij
- VERB-Fin: onkak, xochiyoua, kualtia, ixua, kwaltia, moskaltia, taki, kisa, oksi, peua
- Tran
- VERB-Fin: mochiua, kikua, monamaka, kitoka, mochiwa, monotsa, kitokaj, kitokaytiaj, nimonotsa, kikwa
- Typo
- Yes
- ADV: Nika
- NUM: 1
- VERB-Fin: Nochiua, kiki
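The feature listings above group word forms by the value of one feature, including layered features such as Number[subj] and Person[obj]. The sketch below shows one way to read the FEATS column and build such a grouping. The names `parse_feats` and `forms_by_value` are hypothetical, and the two rows are assembled from feature values listed on this page for kikua and kitokaj; they are not copied from the treebank and may omit features the real annotation carries.

```python
from collections import defaultdict

# Rows assembled for illustration from values listed on this page.
# They are NOT copied from the treebank and may be incomplete.
SAMPLE = "\n\n".join([
    "1\tkikua\t_\tVERB\t_\t"
    "Mood=Ind|Number[obj]=Sing|Number[subj]=Sing|Person[obj]=3|"
    "Person[subj]=3|Subcat=Tran|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_",
    "1\tkitokaj\t_\tVERB\t_\t"
    "Number[obj]=Sing|Number[subj]=Plur|Person[obj]=3|"
    "Subcat=Tran|Tense=Pres|VerbForm=Fin\t0\troot\t_\t_",
]) + "\n"


def parse_feats(feats):
    """Turn 'A=x|B[layer]=y' into {'A': 'x', 'B[layer]': 'y'}."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def forms_by_value(conllu_text, feature):
    """Group distinct word forms by the value of one feature."""
    groups = defaultdict(list)
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multi-word token ranges and empty nodes
        value = parse_feats(cols[5]).get(feature)
        if value is not None and cols[1] not in groups[value]:
            groups[value].append(cols[1])
    return dict(groups)


print(forms_by_value(SAMPLE, "Number[subj]"))
```

Multi-valued entries such as `Fem,Masc` are kept as a single value here, which matches how they are displayed in the listings above.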
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop): etok.
- This corpus uses 5 lemmas as auxiliaries (aux). Examples: ma, hueli, neki, huetsi, nemi.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (1)
- VERB--PRON (1)
- VERB-Fin--NOUN (96)
- VERB-Fin--NOUN-Abs (93)
- VERB-Fin--PRON (536)
- VERB-Fin--PRON-Abs (10)
- obj
- VERB-Fin--NOUN (141)
- VERB-Fin--NOUN-Abs (107)
- VERB-Fin--PRON (59)
- iobj
- VERB-Fin--NOUN (5)
- VERB-Fin--NOUN-Abs (5)
- VERB-Fin--PRON (2)
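The parent--child counts above can be approximated by walking the dependency edges of each sentence. The sketch below assumes, as a reading of the labels rather than a documented rule, that `VERB-Fin` combines the parent's UPOS with its VerbForm value and that `NOUN-Abs` combines the child's UPOS with its Case value. The function names and the placeholder sentence (`w1`, `w2`, `w3`) are invented for illustration.

```python
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}

# Placeholder sentence invented for illustration; NOT treebank data.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\tw1\t_\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tw2\t_\tVERB\t_\tVerbForm=Fin\t0\troot\t_\t_",
    "3\tw3\t_\tNOUN\t_\tCase=Abs\t2\tobj\t_\t_",
    "4\t.\t_\tPUNCT\t_\t_\t2\tpunct\t_\t_",
    "",
])


def parse_feats(feats):
    """Turn 'A=x|B=y' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def label(upos, feats, feature):
    """UPOS plus the value of one feature, e.g. 'VERB-Fin' (assumed scheme)."""
    value = feats.get(feature)
    return f"{upos}-{value}" if value else upos


def count_core_arguments(conllu_text):
    """Count (relation, 'PARENT--CHILD') pairs for verb parents with
    noun or pronoun children."""
    counts = Counter()
    for block in conllu_text.strip().split("\n\n"):
        words = {}
        for line in block.splitlines():
            cols = line.split("\t")
            if line.startswith("#") or not cols[0].isdigit():
                continue  # comments, ranges, empty nodes
            words[cols[0]] = (cols[3], parse_feats(cols[5]), cols[6], cols[7])
        for upos, feats, head, deprel in words.values():
            parent = words.get(head)
            if (deprel in CORE_RELATIONS and upos in {"NOUN", "PRON"}
                    and parent is not None and parent[0] == "VERB"):
                pair = (label(parent[0], parent[1], "VerbForm")
                        + "--" + label(upos, feats, "Case"))
                counts[(deprel, pair)] += 1
    return counts


for key, n in sorted(count_core_arguments(SAMPLE).items()):
    print(key, n)
```

Matching on the exact relation label means subtyped relations would not be counted; whether the published figures include subtypes is not stated on this page.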