UD Khunsari AHA
Language: Khunsari (code: kfm)
Family: IE
This treebank has been part of Universal Dependencies since the UD v2.7 release.
The following people have contributed to making this treebank part of UD: AmirHossein Mojiri Foroushani, Hamid Aghaei, Amir Ahmadi.
Repository: UD_Khunsari-AHA
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: grammar-examples, spoken
Questions, comments? General annotation questions (either Khunsari-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [amojiry (æt) gmail • com]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | annotated manually |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
The AHA Khunsari Treebank is a small treebank for contemporary Khunsari. Its corpus was collected and annotated manually, based on interviews with Khunsari speakers.
The Khunsari treebank consists of 10 sentences at this stage, and we are working to enlarge the corpus. AHA is a small group that analyzes Iranian languages to find their similarities and differences.
Acknowledgments
These sentences were prepared with the help of Vaneshani people. On behalf of the AHA group, we thank Mr. Mohammad Hossein Mashayekhi. Ms. Hanieh Mashayekhi also kindly helped us translate the sentences. As a starting point, we used the sentences suggested by APLL (Academy of Persian Language and Literature) for collecting Iranian languages. This is a research project by AmirHossein, Hamid and Amir (AHA).
To cite this project, you can use the following reference:
- Mojiri Foroushani, AmirHossein; Aghaei, Hamid; Ahmadi, Amir (2020): “AHA Khunsari dependency treebank”, Universal Dependencies (universaldependencies.org)
Statistics of UD Khunsari AHA
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – NOUN – NUM – PRON – PUNCT – SCONJ – VERB
Features
Case – Degree – Mood – Number – NumType – Person – Polarity – PronType – Tense – VerbForm
Relations
advcl – advmod – amod – aux – case – cc – ccomp – compound – compound:lvc – flat – mark – nmod – nmod:poss – nsubj – nummod – obj – obl – punct – root
Tokenization and Word Segmentation
- This corpus contains 10 sentences and 74 tokens.
- This corpus contains 10 tokens (14%) that are not followed by a space.
- This corpus does not contain words with spaces.
- This corpus does not contain words that contain both letters and punctuation.
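The tokenization figures above can be recomputed from the treebank's CoNLL-U file. The following is a minimal sketch rather than the script UD uses to generate this page: the function names and the `SAMPLE` text are hypothetical, the sample tokens are placeholders and not treebank data, and multiword-token ranges and empty nodes are simply skipped.

```python
from typing import Iterator, List

# Hypothetical two-sentence CoNLL-U sample (placeholder tokens, not treebank data).
SAMPLE = """\
# sent_id = demo-1
1\tA\ta\tNOUN\t_\tNumber=Sing\t2\tnsubj\t_\t_
2\tB\tb\tVERB\t_\tTense=Pres\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

# sent_id = demo-2
1\tC\tc\tVERB\t_\tMood=Imp\t0\troot\t_\tSpaceAfter=No
2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_
"""


def read_sentences(text: str) -> Iterator[List[List[str]]]:
    """Yield each sentence as a list of 10-column token rows."""
    rows: List[List[str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain integer IDs only (skips ranges like 1-2 and empty nodes like 1.1).
            if len(cols) == 10 and cols[0].isdigit():
                rows.append(cols)
    if rows:
        yield rows


def tokenization_stats(text: str) -> dict:
    """Count sentences, tokens, tokens without a following space, and forms containing spaces."""
    sentences = list(read_sentences(text))
    tokens = [row for sent in sentences for row in sent]
    # The MISC column (index 9) holds SpaceAfter=No; FORM is column index 1.
    no_space = sum("SpaceAfter=No" in row[9].split("|") for row in tokens)
    with_space = sum(" " in row[1] for row in tokens)
    return {
        "sentences": len(sentences),
        "tokens": len(tokens),
        "no_space_after": no_space,
        "no_space_pct": round(100 * no_space / len(tokens)) if tokens else 0,
        "words_with_spaces": with_space,
    }


print(tokenization_stats(SAMPLE))
```

With the counts reported above, the same rounding gives `round(100 * 10 / 74)`, which is 14, matching the listed percentage.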
Morphology
Tags
- This corpus uses 11 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, NOUN, NUM, PRON, PUNCT, SCONJ, VERB
- This corpus does not use the following tags: PROPN, DET, PART, INTJ, SYM, X
- This corpus contains 7 lemmas tagged as pronouns (PRON): ا, او, ش, م, مو, مون, مُن
- This corpus contains no lemmas tagged as determiners (DET).
- This corpus contains 1 lemma tagged as an auxiliary (AUX): دار
- There is 1 (de)verbal form:
  - Part
    - VERB: اِداجِن, بشتون
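The tag inventory above can be checked in a similar way. This sketch is an illustration under assumptions, not the official statistics script: `used_upos`, `unused_upos`, `lemmas_with_upos` and the `SAMPLE` text are hypothetical names, and the sample tokens are placeholders rather than treebank data. The 17 tags are the universal POS tags of UD v2.

```python
# The 17 universal part-of-speech tags of UD v2.
UPOS_TAGS = (
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
)

# Hypothetical CoNLL-U sample (placeholder tokens, not treebank data).
SAMPLE = """\
1\tA\ta\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tB\tb\tVERB\t_\t_\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def token_rows(conllu_text):
    """Yield the 10-column rows of ordinary tokens, skipping comments and blank lines."""
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def used_upos(conllu_text):
    """Return the set of UPOS tags (column index 3) that occur in the text."""
    return {cols[3] for cols in token_rows(conllu_text)}


def unused_upos(used):
    """Return the universal tags that are absent from `used`, in UPOS_TAGS order."""
    return [tag for tag in UPOS_TAGS if tag not in used]


def lemmas_with_upos(conllu_text, upos):
    """Return the sorted distinct lemmas (column index 2) carrying the given tag."""
    return sorted({cols[2] for cols in token_rows(conllu_text) if cols[3] == upos})


# The 11 tags listed for this treebank leave exactly 6 of the 17 unused.
listed = {"ADJ", "ADP", "ADV", "AUX", "CCONJ", "NOUN", "NUM",
          "PRON", "PUNCT", "SCONJ", "VERB"}
print(unused_upos(listed))
```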
Nominal Features
- Number
  - Plur
    - NOUN: پِلا
    - VERB-Part: اِداجِن
  - Sing
    - ADV: روزی
    - AUX: دارُن
    - NOUN: اُ, بار, برنجِن, بِرا, تا, درختِ, رِختا, ساعت, سال, عبدالله
    - PRON: م, او, ش, ما, مون, مُن, نا
    - VERB: آکَ, ئُ, اِمِگوا, اِچُ, اِکِرُ, بشتون, جیر, دَرکَفتُن, دِ, مِکِرُن
    - VERB-Part: بشتون
- Case
  - Loc
    - ADV: بئون
  - Tem
    - ADV: اِزِ, اِطون, حالا
Degree and Polarity
- Degree
  - Pos
    - ADJ: چند
- Polarity
  - Neg
    - VERB: ندیُ, نَچو
Verbal Features
- Mood
  - Imp
    - VERB: دِ
  - Sub
    - VERB: جیر
- Tense
  - Fut
    - AUX: دارُن
  - Past
    - VERB: آکَ, اِمِگوا, دَرکَفتُن
  - Pres
    - VERB: ئُ, اِچُ, اِکِرُ, جیر, مِکِرُن, ندیُ, نَچو, کَ
Pronouns, Determiners, Quantifiers
- PronType
  - Dem
    - PRON: مون
  - Prs
    - PRON: م, او, ش, ما, مُن, نا
- NumType
  - Card
    - NUM: دِی, یَ, یَک
- Person
  - 1
    - AUX: دارُن
    - PRON: م, ما, مُن, نا
    - VERB: آکَ, اِمِگوا, بشتون, جیر, دَرکَفتُن, مِکِرُن, ندیُ
    - VERB-Part: بشتون
  - 2
    - VERB: دِ
  - 3
    - PRON: او, ش
    - VERB: ئُ, اِداجِن, اِچُ, اِکِرُ, نَچو, کَ
    - VERB-Part: اِداجِن
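Listings like the morphological ones above group word forms by feature value and part of speech. The sketch below shows one way such a table could be built from the FEATS column of a CoNLL-U file. It is an assumption about the approach, not the generator behind this page: `feature_table` and `SAMPLE` are hypothetical names, the sample tokens are placeholders, and combined rows such as VERB-Part are not reproduced.

```python
from collections import defaultdict

# Hypothetical CoNLL-U sample (placeholder tokens, not treebank data).
SAMPLE = """\
1\tA\ta\tPRON\t_\tNumber=Sing|Person=1|PronType=Prs\t2\tnsubj\t_\t_
2\tB\tb\tVERB\t_\tNumber=Sing|Person=1|Tense=Past\t0\troot\t_\t_
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def feature_table(conllu_text):
    """Map (feature, value) to {UPOS: sorted distinct word forms}."""
    table = defaultdict(lambda: defaultdict(set))
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, upos, feats = cols[1], cols[3], cols[5]
        if feats == "_":
            continue  # token carries no morphological features
        for pair in feats.split("|"):
            name, _, values = pair.partition("=")
            # A feature may list several comma-separated values.
            for value in values.split(","):
                table[(name, value)][upos].add(form)
    return {
        key: {upos: sorted(forms) for upos, forms in by_upos.items()}
        for key, by_upos in table.items()
    }


for (name, value), by_upos in sorted(feature_table(SAMPLE).items()):
    print(name, value, by_upos)
```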
Other Features
Syntax
Auxiliary Verbs and Copula
- This corpus does not contain copulas.
- This corpus uses 1 lemma as an auxiliary (aux). Example: دار.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
  - VERB--NOUN (4)
  - VERB--PRON (3)
- obj
  - VERB--NOUN (3)
  - VERB--PRON (1)
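The parent and child counts above can be sketched as a pass over the HEAD and DEPREL columns of a CoNLL-U file. This is an illustrative sketch under assumptions, not the official statistics code: `core_argument_counts` and `SAMPLE` are hypothetical names, the sample tokens are placeholders rather than treebank data, and relation labels are matched exactly, so subtyped labels would need to be listed separately.

```python
from collections import Counter

# Hypothetical CoNLL-U sample (placeholder tokens, not treebank data).
SAMPLE = """\
1\tA\ta\tPRON\t_\t_\t3\tnsubj\t_\t_
2\tB\tb\tNOUN\t_\t_\t3\tobj\t_\t_
3\tC\tc\tVERB\t_\t_\t0\troot\t_\t_
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_

1\tD\td\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tE\te\tVERB\t_\t_\t0\troot\t_\t_
"""


def core_argument_counts(conllu_text, relations=("nsubj", "obj")):
    """Count (relation, parent UPOS, child UPOS) for VERB parents with NOUN or PRON children."""
    counts = Counter()
    sentence = {}  # token ID -> columns, for the sentence being read

    def flush():
        # Resolve each token's head within the finished sentence, then reset.
        for cols in sentence.values():
            head = sentence.get(cols[6])  # HEAD is column index 6; "0" (root) has no entry
            if (head is not None and cols[7] in relations
                    and head[3] == "VERB" and cols[3] in ("NOUN", "PRON")):
                counts[(cols[7], head[3], cols[3])] += 1
        sentence.clear()

    for line in conllu_text.splitlines():
        if not line.strip():
            flush()
        elif not line.startswith("#"):
            cols = line.split("\t")
            if len(cols) == 10 and cols[0].isdigit():
                sentence[cols[0]] = cols
    flush()  # the last sentence may not be followed by a blank line
    return counts


for (rel, parent, child), n in sorted(core_argument_counts(SAMPLE).items()):
    print(f"{rel}: {parent}--{child} ({n})")
```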