This page pertains to UD version 2.

UD Akuntsu TuDeT

Language: Akuntsu (code: aqz)
Family: Tupian

This treebank has been part of Universal Dependencies since the UD v2.7 release.

The following people have contributed to making this treebank part of UD: Carolina Aragon, Fabrício Ferraz Gerardi.

Repository: UD_Akuntsu-TuDeT
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15

License: CC BY-SA 4.0

Genre: nonfiction, news

Questions or comments? General annotation questions (either Akuntsu-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [fabricio • gerardi (æt) uni-tuebingen • de]. Development of the treebank happens outside the UD repository, so any bug must be fixed either in the original data source or in the conversion procedure. Do not submit pull requests against the UD repository.

Annotation	Source
Lemmas	annotated manually in non-UD style, automatically converted to UD
UPOS	annotated manually in non-UD style, automatically converted to UD
XPOS	annotated manually
Features	annotated manually in non-UD style, automatically converted to UD
Relations	annotated manually in non-UD style, automatically converted to UD

Description

UD_Akuntsu-TuDeT is a collection of annotated sentences in Akuntsú. The sentences stem from the grammatical description by Aragon (2014) and from Aragon’s field work. The treebank is part of TuLaR (Tupían Language Resources). The project is a work in progress, and the treebank is updated on a regular basis. Sentence annotation and documentation are by Carolina Aragon, Fabrício Ferraz Gerardi, and Luana dos Santos.

Text sources

Aragon, Carolina (2018) *Variações estilísticas e sociais no discurso dos falantes Akuntsú*. Revista Polifonia, v. 25, 90-103.
Aragon, Carolina (2018) *Posposições e marcadores oblíquos em Akuntsú (Tupí)*. Revista Brasileira de Linguística Antropológica, v. 10, 47-57.
Aragon, Carolina (2015) *Considerações sobre os ideofones e seu uso em Akuntsú*. Revista de Letras (Taguatinga), v. 8, 1-13.
Aragon, Carolina (2014) *A Grammar of Akuntsú, a Tupian language*. Unpublished PhD dissertation, University of Hawaii.
Aragon, Carolina (2008) *Fonologia e aspectos morfológicos e sintáticos da língua Akuntsú*. Unpublished master's thesis, Universidade de Brasília.

Acknowledgments

The development of this treebank is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 834050).

Statistics of UD Akuntsu TuDeT

POS Tags

ADJ – ADP – ADV – AUX – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – VERB

Features

Aspect – Case – Clusivity – Deixis – Determ – Foc – Mood – Nomzr – Number – NumType – Obl – Person – Person[psor] – Person[subj] – Polarity – PronType – Redup – Reflex – Rel – Tense – Trans – Tv – Voice

Relations

advcl – advmod – amod – appos – aux – case – ccomp – conj – dep – det – discourse – dislocated – iobj – nmod – nsubj – nummod – obj – obl – parataxis – punct – root – xcomp

Tokenization and Word Segmentation

This corpus contains 343 sentences, 1449 tokens and 1468 syntactic words.

All tokens in this corpus are followed by a space.

This corpus does not contain words with spaces.

This corpus does not contain words that contain both letters and punctuation.

This corpus contains 19 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
There are 15 types of multi-word tokens. Examples: ino, kɨrom, menerom, apiteperom, ataperom, ekerom, etom, iteterom, itʃoberom, jãjerom, kitʃetom, korom, mepiterom, taɨperom, tʃetom.
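
The figures above are mutually consistent: each of the 19 multi-word tokens contributes one surface token but 2.00 syntactic words on average, so 1449 + 19 × (2 − 1) = 1468. As a minimal sketch of how such counts could be derived from a CoNLL-U file, assuming the standard ID conventions (plain integers for words, ranges such as `1-2` for multi-word tokens, decimals for empty nodes): the function name `conllu_counts`, the helper `row` and the toy sentences are invented for illustration and are not Akuntsu data or part of the UD tooling.

```python
def conllu_counts(text):
    """Count sentences, surface tokens, syntactic words and multi-word tokens."""
    sentences = tokens = words = mwts = 0
    covered_until = 0      # highest word ID covered by the latest MWT range
    in_sentence = False
    for line in text.splitlines():
        if not line.strip():               # a blank line closes a sentence
            if in_sentence:
                sentences += 1
            in_sentence, covered_until = False, 0
            continue
        if line.startswith("#"):           # sentence-level comment
            continue
        in_sentence = True
        tok_id = line.split("\t", 1)[0]
        if "-" in tok_id:                  # multi-word token range, e.g. "1-2"
            mwts += 1
            tokens += 1
            covered_until = int(tok_id.split("-")[1])
        elif "." in tok_id:                # empty node, neither token nor word
            continue
        else:                              # ordinary syntactic word
            words += 1
            if int(tok_id) > covered_until:
                tokens += 1                # not part of a multi-word token
    if in_sentence:                        # file may lack a final blank line
        sentences += 1
    return {"sentences": sentences, "tokens": tokens,
            "words": words, "multiword_tokens": mwts}


def row(tok_id, form):
    """Build a 10-column CoNLL-U line with placeholder annotation (toy helper)."""
    return "\t".join([tok_id, form] + ["_"] * 8)


# Invented toy input: placeholder forms, not taken from the treebank.
TOY_CONLLU = "\n".join([
    "# sent_id = toy-1",
    row("1-2", "xy"), row("1", "x"), row("2", "y"), row("3", "z"), row("4", "."),
    "",
    "# sent_id = toy-2",
    row("1", "w"), row("2", "."),
    "",
])

print(conllu_counts(TOY_CONLLU))

# Consistency check on the figures reported on this page.
assert 1449 + 19 * (2 - 1) == 1468
```

Running the same function over the released `.conllu` file would be one way to re-derive the sentence, token and word counts after a treebank update.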

Morphology

Nominal Features

Number

Plur
- VERB: kitʃet

Sing
- NOUN: ikɨp
- PRON: en, on, erẽ, erẽbõ, orẽ, te, ebõ, enõ
- VERB: imaã, oewɨbɨka, oirika, ojã

Case

Abl
- NOUN: atʃiri, kɨrẽri, piri, tawtʃeri
- PRON: aroperi, ẽromri

All
- DET: jẽbõ, kebõ
- NOUN: tabɨtõ, kɨrẽbõ, ɨkɨbõ, ekõ, kirẽbõ, kojõpebõ, kojõpibõ, pabapebõ, pibõ, tekõ
- PRON: erẽbõ, enõ, tebõ
- PROPN: Kanibõ

Dat
- DET: kebõ
- PRON: orẽbõ, ebõ

Loc
- ADP: etʃe
- NOUN: ɨkɨpe, eanampe

Tra
- NOUN: kiakopna, kwena, menna, nakona, pitoana, takɨrapna, tatona, tawpɨkna, tawtʃena, emenna
- PART: ana

Degree and Polarity

Polarity

Neg
- ADV: nom, nõm, erom, rom, om

Verbal Features

Aspect

Hab
- VERB: oetara, koara, mira, etʃetara, kietara, kitʃetara, oamõjara, teipara

Iter
- VERB: ikiramkwatekwa

Mood

Ind
- VERB: kietara, kitʃetara

Tense

Fut
- PART: kom

Voice

Cau
- VERB: mõatʃoa

Pronouns, Determiners, Quantifiers

PronType

Emp
- PRON: erẽ, orẽ

Ind
- PRON: no

Prs
- PRON: en, on, orẽbõ, erẽbõ, kitʃe, te, enõ

NumType

Card
- NUM: tɨrɨ, kɨte, tɨɾɨ, tɨrɨtɨrɨtɨrɨ

Reflex

Yes
- AUX: tejã
- NOUN: pe, jen, po, teten, epo, opo, teatap, teimaj, teimi, teip
- PRON: tebõ
- VERB: teita, teeta, teimaj, teipara, tekwata, teakata, teaota, teera, teipa, tejã

Person

1
- AUX: ojã, otoa, kitoa
- NOUN: omepit, oike, otʃipap, oatap, oko, okɨp, okɨpi, otak, opo, itet
- PRON: on, orẽbõ, kitʃe, orẽ
- VERB: oerekwa, oetara, oamõja, opera, opip, otʃeta, otʃoa, itet, kietara, kipera

2
- AUX: ejã, eko, etoa
- NOUN: epo, eape, eboro, ekem, ekoro, epi, eti, eanampe, eiat, emenna
- PRON: en, erẽbõ, erẽ, on, ebõ, enõ
- VERB: koara, eeta, eneme, epekã, eata, eerekkwa, eimi, eipa, etʃera, etʃetara

3
- AUX: iko, iam, tejã, tejãkwa
- NOUN: ikɨp, iatap, iiw, imen, imepit, iten, itoap, itʃobe, itʃoke, tajtʃi
- PRON: i, te, tebõ
- VERB: teita, ikora, iat, ikoa, taot, teeta, iata, iekɨj, ijã, ikɨta

Other Features

Clusivity
- In
  - AUX: kitoa
  - PRON: kitʃe
  - VERB: kietara, kipera, kitʃet, kitʃetara

Deixis
- Dist
  - DET: ke, jẽrom, ta, tarom, ẽrom, kebõ
- Prox
  - DET: jẽ, ẽ, eme, kebõ, jẽbõ

Determ
- Yes
  - NOUN: eot

Foc
- Yes
  - PART: ne

Nomzr
- Circ
  - NOUN: atʃoap, nĩap, parãap, tʃogaap, pajapna
- Obj
  - NOUN: imi, iõ, imokwa, itʃopa, oiko, eiat, oiat, teimaj, teimi
  - VERB: iko, eimi, oiko

Obl
- Yes
  - NOUN: atitipe, kɨppe, pagoppe, oikepe

Person[psor]
- 1
  - NOUN: oiat
- 2
  - NOUN: epi
- 3
  - NOUN: teten

Person[subj]
- 2
  - NOUN: epi

Redup
- Yes
  - NOUN: kapakapa
  - NUM: tɨrɨtɨrɨtɨrɨ
  - VERB: kõjkõjkõj, nininia

Rel
- Cont
  - NOUN: tek, tep, tet, tanam, takɨma, tokwaj, otek, tekõ, tepna, tetna
  - VERB: itet, itetkwa

Trans
- Yes
  - NOUN: pɨtka, ipitka, iɨka
  - VERB: erekka, jãjka, pɨtka, tʃãka, oerekwa, amkwa, apeka, atabaka, buhka, erekkwa

Tv
- Yes
  - VERB: ata, koa, tʃopa, mia, nia, atʃoa, ikoa, koara, oetara, õa

Syntax

Auxiliary Verbs and Copula

This corpus does not contain copulas.

This corpus uses 7 lemmas as auxiliaries (aux). Examples: ko, tʃe, jã, ka, toa, am, piro.

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).

nsubj
- VERB--NOUN (52)
- VERB--PRON (51)
- VERB--PRON-All (4)
- VERB--PRON-Dat (2)

obj
- VERB--NOUN (122)
- VERB--NOUN-ADP(pabape) (1)
- VERB--NOUN-All (2)
- VERB--PRON (1)

iobj
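
No patterns are listed under iobj. The nsubj, obj and iobj breakdowns in this section pair a VERB parent with a NOUN or PRON child and, where present, append the child's Case value (for example `VERB--PRON-All`). The following is a rough sketch of how such pattern counts could be computed from CoNLL-U text. It is an assumption about the procedure, not the script used to generate this page, and it ignores adposition dependents such as the `ADP(pabape)` label. The names `core_argument_patterns` and `row` and the toy sentence are hypothetical.

```python
import re
from collections import Counter


def core_argument_patterns(text, deprels=("nsubj", "obj", "iobj")):
    """Count (deprel, 'VERB--UPOS[-Case]') patterns for noun/pronoun children."""
    counts = Counter()
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = {}
        for line in block.splitlines():
            if not line.strip() or line.startswith("#"):
                continue
            cols = line.split("\t")
            if cols[0].isdigit():          # skip MWT ranges and empty nodes
                rows[cols[0]] = cols
        for cols in rows.values():
            upos, feats, head, deprel = cols[3], cols[5], cols[6], cols[7]
            parent = rows.get(head)
            if deprel not in deprels or upos not in ("NOUN", "PRON"):
                continue
            if parent is None or parent[3] != "VERB":
                continue
            # FEATS is "Name=Value|Name=Value" or "_" when empty.
            feat_map = dict(f.split("=", 1) for f in feats.split("|") if "=" in f)
            case = feat_map.get("Case")
            label = f"VERB--{upos}" + (f"-{case}" if case else "")
            counts[(deprel, label)] += 1
    return counts


def row(i, form, upos, feats, head, deprel):
    """Build a 10-column CoNLL-U line (toy helper)."""
    return "\t".join([str(i), form, "_", upos, "_", feats, str(head), deprel, "_", "_"])


# Invented toy sentence with placeholder forms, not treebank data.
TOY = "\n".join([
    row(1, "a", "NOUN", "_", 2, "nsubj"),
    row(2, "b", "VERB", "_", 0, "root"),
    row(3, "c", "PRON", "Case=All", 2, "obj"),
]) + "\n"

for (deprel, label), n in sorted(core_argument_patterns(TOY).items()):
    print(deprel, label, n)
```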

Verbs with Reflexive Core Objects

This corpus contains 5 lemmas that occur at least once with a reflexive core object (obj or iobj). Examples: at tekɨjt, ka epo, poro jen, tʃoga opo, õkwa po.

Relations Overview

This corpus does not use relation subtypes.
The following 15 relation types are not used in this corpus at all: csubj, vocative, expl, cop, mark, acl, clf, cc, fixed, flat, compound, list, orphan, goeswith, reparandum.
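
The list of unused relations follows from a set difference between the 37 universal relations of UD v2 and the 22 relation types listed under Relations on this page. A small sketch of that check, with variable names chosen only for illustration:

```python
# The 37 universal dependency relations defined in UD v2.
UNIVERSAL_RELATIONS = {
    "acl", "advcl", "advmod", "amod", "appos", "aux", "case", "cc", "ccomp",
    "clf", "compound", "conj", "cop", "csubj", "dep", "det", "discourse",
    "dislocated", "expl", "fixed", "flat", "goeswith", "iobj", "list", "mark",
    "nmod", "nsubj", "nummod", "obj", "obl", "orphan", "parataxis", "punct",
    "reparandum", "root", "vocative", "xcomp",
}

# Relation types reported as used in UD Akuntsu TuDeT (from this page).
USED_RELATIONS = {
    "advcl", "advmod", "amod", "appos", "aux", "case", "ccomp", "conj", "dep",
    "det", "discourse", "dislocated", "iobj", "nmod", "nsubj", "nummod", "obj",
    "obl", "parataxis", "punct", "root", "xcomp",
}

unused = sorted(UNIVERSAL_RELATIONS - USED_RELATIONS)
print(len(unused), ", ".join(unused))
```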