UD Gwichin TueCL
Language: Gwichin (code: gwi)
Family: Na-Dene
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Matthew Andrews, Çağrı Çöltekin.
Repository: UD_Gwichin-TueCL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: grammar-examples
Questions, comments? General annotation questions (either Gwichin-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [matthew • andrews (æt) student • uni-tuebingen • de, cagri • coeltekin (æt) uni-tuebingen • de]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | not available |
Relations | annotated manually, natively in UD style |
Description
UD_Gwichin-TueCL is a small treebank of Alaskan Gwich’in, the first treebank of an Athabascan language in UD. Gwich’in, also known as Dinjii zhuh ginjik, is an endangered language spoken in Alaska and Canada by no more than 500 people.
This treebank began as a course project at the University of Tübingen in Germany. The data used in this treebank is based on material located in the Alaska Native Language Archive in Fairbanks, Alaska, the developer’s hometown.
Acknowledgments
I want to thank the native Gwich’in speakers and the Doyon Foundation whose materials I’ve used to learn the Gwich’in language. I’d also like to thank Michael Krauss and Siri Tuttle who inspired my interest in Athabascan languages.
References
- Alaska Native Language Archive. n.d. Gwich’in collection - Alaska Native Language Archive (ANLA). Accessed: 15-Oct-2024.
- Matthew Andrews. 2023. Dictionary projects.
- John Busch. 2000. Finding your way through a story: Direction terms in Gwich’in narrative. University of Alaska Fairbanks.
- Scott T Bushey. 2021. Western Gwich’in classificatory verbs. University of Alaska Fairbanks.
- Doyon Foundation. n.d. Doyon languages online. Accessed: 15-Oct-2024.
- Gwich’in Social and Cultural Institute. n.d.a. Gwich’in language store. Accessed: 15-Oct-2024.
- Gwich’in Social and Cultural Institute. n.d.b. Gwich’in online dictionary. Accessed: 15-Oct-2024.
- Patrick Marlow and Lillian Garnett. 1996. Beginning Athabaskan Gwich’in ANL142. University of Alaska Fairbanks.
- C. Mishler and K. Frank. 2019. Dinjii vadzaih dhidlit. IPI.
- Dick Mueller and Lillian Garnett. March 1994. Western Gwich’in topical dictionary. Alaska Native Language Center and the Summer Institute of Linguistics.
- Katherine Peter. 1979. Dinjii zhuh ginjik nagwan tr’iłtsaii: Gwich’in junior dictionary. Alaska Native Language Center.
Statistics of UD Gwichin TueCL
POS Tags
ADJ – ADP – ADV – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Relations
acl – advcl – advmod – amod – appos – case – cc – ccomp – compound – conj – dep – det – discourse – fixed – flat – iobj – mark – nmod – nsubj – nummod – obj – obl – punct – reparandum – root – vocative
Tokenization and Word Segmentation
- This corpus contains 313 sentences and 1008 tokens.
- This corpus contains 324 tokens (32%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 208 types of words that contain both letters and punctuation. Examples: ts’ą̀’, Ch'adhah, dąį’, Ch’ilik, gwats’an, neenahąąl’yàa, ts’eh, ts’à’, Kǫ’, K’ǫǫ, Noh’in, ałch’yaa, ch’ih’àa, dòonch’yàa, geet’ihthan, geh’àn, gwiink’oo, gwìłts’ìk, hąąh’yaa, ihtł’uu, intł’uu, iłts’ik, kèeshi’ìn, ni’įį, shi’įį, tr’ąąh’in, tseegii’in, yagha’, ąhch’yaa, Ak'ìi, Alk'ìi, Ank'ìi, Ch'ahakhwanjyaa, Ch'ahan, Ch'akhwanii, Ch'anjaa, Ch'igiheenjyaa, Ch'igiinii, Ch'iheenjyaa, Ch'ihininjyaa, Ch'ihishinjyaa, Ch'iinii, Ch'iitąįį, Ch'in'àl, Ch'ininii, Ch'ir'iinii, Ch'iriheenjyaa, Ch'ishinii, Ch'itł’eets’al, Ch’aga’àa
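The tokenization figures above can be recomputed from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function names `iter_sentences` and `tokenization_stats` are hypothetical, the sample uses invented placeholder forms rather than treebank data, and it assumes the file has no multiword tokens, so that syntactic words and surface tokens coincide.

```python
# Toy CoNLL-U fragment with placeholder forms (NOT taken from the treebank).
SAMPLE = """\
# sent_id = toy-1
1\tword\tword\tNOUN\t_\t_\t2\tobj\t_\t_
2\tverb\tverb\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

# sent_id = toy-2
1\tverb\tverb\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
2\t!\t!\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def iter_sentences(text):
    """Yield each sentence as a list of 10-column rows."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if "-" in cols[0] or "." in cols[0]:
            continue
        sentence.append(cols)
    if sentence:
        yield sentence


def tokenization_stats(text):
    """Return (sentences, tokens, tokens not followed by a space)."""
    n_sent = n_tok = n_nospace = 0
    for sent in iter_sentences(text):
        n_sent += 1
        for cols in sent:
            n_tok += 1
            # MISC is the tenth column; attributes are separated by "|".
            if "SpaceAfter=No" in cols[9].split("|"):
                n_nospace += 1
    return n_sent, n_tok, n_nospace


print(tokenization_stats(SAMPLE))
# Share reported on this page: 324 of 1008 tokens.
print(round(100 * 324 / 1008))
```

Run on the full treebank file, the same counts should correspond to the sentence, token and no-space figures listed above, provided the assumption about multiword tokens holds.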
Morphology
Tags
- This corpus uses 15 UPOS tags out of 17 possible: ADJ, ADP, ADV, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: AUX, SYM
- This corpus contains 3 word types tagged as particles (PART): kwaa, kwàa, nąįį
- This corpus contains 2 lemmas tagged as pronouns (PRON): jidìi, shįį
- This corpus contains 3 lemmas tagged as determiners (DET): aii, izhik, yagha’
- This corpus contains no lemmas tagged as auxiliaries (AUX).
- This corpus does not use the VerbForm feature.
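The tag inventory above amounts to a set difference against the 17 universal POS tags. A small sketch of that check, with the used tags copied from this page and the variable names chosen only for illustration:

```python
# The 17 universal POS tags defined by UD.
UPOS_ALL = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Tags reported as used in this treebank (copied from the list above).
USED = {
    "ADJ", "ADP", "ADV", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB", "X",
}

# Tags that never occur in the corpus.
unused = sorted(UPOS_ALL - USED)
print(len(USED), "of", len(UPOS_ALL), "tags used; unused:", unused)
```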
Nominal Features
Degree and Polarity
Verbal Features
Pronouns, Determiners, Quantifiers
Other Features
Syntax
Auxiliary Verbs and Copula
- This corpus does not contain copulas.
- This corpus does not contain auxiliaries.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
  - VERB--NOUN (26)
  - VERB--PRON (1)
- obj
  - VERB--NOUN (96)
  - VERB--PRON (1)
- iobj
  - VERB--NOUN (1)
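Counts of this kind can be gathered by pairing each noun or pronoun with the UPOS tag of its head. The sketch below is one way to do it under stated assumptions: `verb_argument_counts` is a hypothetical helper, sentences are given as lists of 10-column CoNLL-U rows, and the toy sentence uses invented placeholder forms rather than treebank data.

```python
from collections import Counter

# One toy sentence as CoNLL-U rows (placeholder forms, NOT treebank data).
TOY_SENTENCES = [
    [
        ["1", "n1", "n1", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
        ["2", "n2", "n2", "NOUN", "_", "_", "3", "obj", "_", "_"],
        ["3", "v", "v", "VERB", "_", "_", "0", "root", "_", "_"],
        ["4", "p", "p", "PRON", "_", "_", "3", "obj", "_", "_"],
    ]
]


def verb_argument_counts(sentences):
    """Count (deprel, 'VERB--CHILD') pairs for NOUN/PRON children of verbs."""
    counts = Counter()
    for sent in sentences:
        # Map token ID to UPOS so each child's head tag can be looked up.
        upos_by_id = {cols[0]: cols[3] for cols in sent}
        for cols in sent:
            child_upos, head, deprel = cols[3], cols[6], cols[7]
            if child_upos in {"NOUN", "PRON"} and upos_by_id.get(head) == "VERB":
                counts[(deprel, f"VERB--{child_upos}")] += 1
    return counts


for (deprel, pair), n in sorted(verb_argument_counts(TOY_SENTENCES).items()):
    print(f"{deprel}: {pair} ({n})")
```

The root token has head `0`, which is not a token ID, so it is never counted as a child of a verb.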