UD Hausa NorthernAutogramm
Language: Hausa (code: ha)
Family: Afro-Asiatic
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Bernard Caron.
Repository: UD_Hausa-NorthernAutogramm
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: spoken
Questions, comments? General annotation questions (either Hausa-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [bernard • l • caron (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
This treebank (Northern Autogramm) contains data for the Ader dialect of Northern Hausa, spoken in the Republic of Niger.
Ader (Northern) Hausa, like the Sokoto variety, is more archaic than Standard Hausa in that some phonological rules have not applied.
The treebank contains 400 sentences and 3,919 tokens.
It is maintained in the SUD framework (SUD_Hausa-NorthernAutogramm) and converted automatically to UD.
References
- Caron, Bernard. 1991. Le haoussa de l’Ader (Sprache und Oralität in Afrika). Vol. 10. Berlin: D. Reimer. https://www.academia.edu/110044586/Caron_1991_Le_haoussa_de_lAder?sm=b.
Statistics of UD Hausa NorthernAutogramm
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Aspect – Case – Definite – Deixis – ExtPos – Gender – Mood – Number – PartType – Person – Polarity – PronType – Reflex – Tense – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advcl:cleft – advmod – amod – appos – aux – case – cc – cc:preconj – ccomp – compound – conj – cop – dep – det – discourse – dislocated – fixed – flat:name – iobj – mark – nmod – nsubj – obj – obl – obl:arg – parataxis – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 423 sentences, 4119 tokens and 4248 syntactic words.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 43 types of words that contain both letters and punctuation. Examples: sa'ànnan, aː'àː, sa’ànnan, s'ayàː, ta', waːc'èː, ya', baː'à, du', yas', aː’àː, duːc'ìː, gàyyaː., kaːc'èː, koː'ìnaː, taːs'às, wa'ànda, àrbà'in, a', c’eːrèː, c’ìnka, dà', geːmèː!//], gùdaː-gudàn, ha', he'è:, his'ariː, ki', kwànce-kwancèn, kwànce-kwànce, kàmas', láːtà'addù, nà:, nân., s'aisà, s'àkaːniː, s'àkaːnìn, sà'addà, taːs'àt, wàhalà', ƴa', ɗan'ubancìː, ḿː'm̀ː
- This corpus contains 129 multi-word tokens. On average, one multi-word token consists of 2.00 syntactic words.
- There are 87 types of multi-word tokens. Examples: ankài, shìgattà, kukài, shikài, mis, raːnakkà, uwattà, uwaːtai, yâːtaː, àbinkà, bambanciyassù, baːyuːnai, bân, cikinsù, gardamàssu, rânta, sukài, sunkài, sàːmai, tai, tassan, ƙarhiːnai, abìnga, akài, askaː, bambanciyaːtai, baːyunkì, baːyuːna, biyànka, bàːkinsù, bâsshì, c’ìnkai, dumèːnai, duːkìyattà, ganai, ganiːnai, gidankà, gidansù, gwàlmaːtai, indà, iːkòːnai, jàkkainaː, jàkkankì, jàlloːnai, jèːwàyèsshi, ka, kai, ki, kirànka, kunyàːtai.
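The counts above can be recomputed from a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: it assumes the standard CoNLL-U ID conventions (a range ID such as `1-2` marks a multi-word token, a decimal ID such as `3.1` marks an empty node), and the sample sentences use invented placeholder forms rather than real Hausa data. The helper names `make_row` and `count_units` are hypothetical.

```python
def make_row(token_id, form):
    """Build one 10-column CoNLL-U line; columns not needed here are '_'."""
    return "\t".join([token_id, form] + ["_"] * 8)


def count_units(conllu_text):
    """Count sentences, surface tokens, multi-word tokens and syntactic words."""
    counts = {"sentences": 0, "tokens": 0, "mwt": 0, "mwt_words": 0, "words": 0}
    in_sentence = False
    covered_until = 0  # last word ID covered by the current multi-word token
    for line in conllu_text.splitlines():
        if not line.strip():        # a blank line ends a sentence
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):    # sentence-level comment
            continue
        if not in_sentence:
            counts["sentences"] += 1
            in_sentence = True
        token_id = line.split("\t")[0]
        if "-" in token_id:         # range ID: one surface token, several words
            start, end = (int(part) for part in token_id.split("-"))
            counts["mwt"] += 1
            counts["mwt_words"] += end - start + 1
            counts["tokens"] += 1
            covered_until = end
        elif "." in token_id:       # empty node: counted neither as token nor word
            continue
        else:                       # plain integer ID: a syntactic word
            counts["words"] += 1
            if int(token_id) > covered_until:
                counts["tokens"] += 1   # word is its own surface token
    return counts


# Invented placeholder data: the token "xy" splits into the words "x" and "y".
sample = "\n".join([
    "# sent_id = demo-1",
    make_row("1-2", "xy"),
    make_row("1", "x"),
    make_row("2", "y"),
    make_row("3", "z"),
    "",
    "# sent_id = demo-2",
    make_row("1", "w"),
    "",
])
result = count_units(sample)
print(result)
```

Under this counting scheme, words = tokens + (words inside multi-word tokens) - (multi-word tokens). The figures reported above fit that relation: with 129 multi-word tokens of 2.00 words each, 4119 + 129 = 4248.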
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: SYM
- This corpus contains 21 word types tagged as particles (PART): ba, baːbù, bàː, bâː, dai, dà, gàː, hakàn, hwa, kòː, mài, na, naː, nà:, nàː, shìn, ta, wai, zâ, zâː, àkwai
- This corpus contains 61 lemmas tagged as pronouns (PRON): =ai, =ka, =ki, =ku, =kà, =kù, =nai, =naː, =shi, =su, =sù, =ta, =tai, =tay, =taː, =tà, =ya, eː, hm̂ː, ita, ka, kai, keː, ki, koːmi, koːmiː, koːwaː, kà, kâinai, kânka, mai, mat, matà, maː, min, miː, musù, naːkù, naːshì, naːtà, ni, niː, níː, shi, shiː, shì, su, suː, sà'addà, ta, taːkà, taːsù, wandà, wani, wàccan, wàgga, wàncân, wàncéːnìyaː, wànga, wànnan, wâggàːshi
- This corpus contains 10 lemmas tagged as determiners (DET): can, dukà, ga, nan, su, waccè, wani, wanèː, wàccân, yak
- Out of the above, 2 lemmas occurred sometimes as PRON and sometimes as DET: su, wani
- This corpus contains 1 lemma tagged as auxiliary (AUX): _
- There are 2 (de)verbal forms:
- Part
- VERB: bìye, tàhe, tsàye, kwànce, màːlìye, tàushe, zàmne
- Vnoun
- VERB: tàhiyàː, yîː, zakkùwaː, sôn, bìyash, gàmuwaː, cîn, kwaːnaː, gudùː, hwaːɗùwaː
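Two of the tag statistics above are simple set operations: the unused tags are the difference between the 17 universal POS tags and the tags found in the corpus, and the PRON/DET overlap is the set of lemmas seen with more than one tag. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the DET lemmas are the full list given above, and the PRON lemmas are only an excerpt of the 61 listed.

```python
from collections import defaultdict

# The 17 universal POS tags defined by UD.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}


def unused_tags(used):
    """Return the universal tags that never occur in the corpus."""
    return sorted(UPOS_TAGS - set(used))


def lemmas_with_several_tags(lemma_tag_pairs):
    """Return lemmas that occur with more than one UPOS tag."""
    tags_by_lemma = defaultdict(set)
    for lemma, tag in lemma_tag_pairs:
        tags_by_lemma[lemma].add(tag)
    return {
        lemma: sorted(tags)
        for lemma, tags in tags_by_lemma.items()
        if len(tags) > 1
    }


# Tags reported as used in this corpus.
used = ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
        "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB", "X"]

# DET lemmas: full list from above. PRON lemmas: excerpt only.
det_lemmas = ["can", "dukà", "ga", "nan", "su", "waccè", "wani", "wanèː",
              "wàccân", "yak"]
pron_excerpt = ["ita", "ka", "kai", "ni", "shi", "su", "ta", "wani"]

pairs = [(lemma, "DET") for lemma in det_lemmas]
pairs += [(lemma, "PRON") for lemma in pron_excerpt]

print(unused_tags(used))
print(lemmas_with_several_tags(pairs))
```

On a real CoNLL-U file, the `(lemma, tag)` pairs would come from the LEMMA and UPOS columns of each word line.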
Nominal Features
- Fem
- ADJ: wacèː, ƴag, ƴak
- AUX: tanàː, tac, tà, taː, takè, tay, tas, ta', tab, tak
- DET: wata, wàccân, tan, waccè
- NOUN: dàːmisàː, kuːraː, gàyyaː, duːniyàː, yâː, raːnaː, shìgat, kwaːnaː, raːnak, rân
- PART: ta
- PRON: ita, =tà, wàccan, tà, =ta, matà, ta, =kì, wàgga, keː
- VERB: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, gàyyaː., hwaːɗùwaː, kankaryaː, sarɓaː, bìɗaː, cêːwaː
- VERB-Vnoun: tàhiyàː, zakkùwaː, bìyash, gàmuwaː, hwaːɗùwaː, kankaryaː, sarɓaː, bìɗaː, cêːwaː, daɗèːwaː
- Masc
- ADJ: ɗan, baƙiː, hwarin, hwariː, namijì, ƙàramiː, baƙin, hìyayyem, jan
- AUX: yac, shinàː, yaː, kaː, shì, yaz, kà, yat, yay, yah
- DET: wani, wanèː
- NOUN: kàreː, gidaː, sâː, yaːɗaː, zàkaràː, ɓiki, mùtun, àbin, mùzuːruː, sarkin
- PRON: shiː, =tai, shì, shi, =nai, =kà, mai, =ka, kai, kà
- PROPN: Buːzuː, Bàhaushèː, Bàgawailèː
- VERB: yîː, sôn, cîn, kwaːnaː, gudùː, sauraːreː, taːshìː, hwaɗìː, yîn, zaman
- VERB-Vnoun: yîː, sôn, cîn, kwaːnaː, gudùː, taːshìː, hwaɗìː, yîn, zaman, zamaː
- Plur
- ADJ: maːtaː
- AUX: sunkà, sunàː, sun, sukà, sù, kukà, kun, sukè, kukè, bàkù
- DET: su, wasu
- NOUN: ruwaː, mutàːneː, ƴan, giːwàːyeː, zàːrùmmai, baːyuː, kuɗɗiː, maːtaː, ƙwàːriː, cinàn
- PRON: =sù, suː, musù, sù, =su, su, wa'ànda, =kù, wasu, =ku
- PROPN: Tuːraːwaː, Buːzàːyeː, Hausaːwaː, Baːgayaːwaː, Gàwàllai
- Sing
- AUX: ìn, naː, inàː, bàn, ani, nikà, nim, nis, nish, niy
- PART: mài
- PRON: niː, min, =na, ni, =taː, =naː, nì, shiː
- Ben
- ADP: mà
- PRON: mai, musù, maː, min, matà, maw
- Gen
- PRON: =nai, =tà, =tai, =kà, =su, =kù, =ta, naːshì, =kì, naːkù
- Nom
- PRON: shiː, ita, niː, suː, kai, keː, shi
- Cons
- ADJ: ɗan, hwarin, jan, baƙin, hìyayyem, ƴag, ƴak
- NOUN: àbin, sarkin, bàːkin, kàram, shìgat, ƴan, ɗan, loːkàcin, wurin, yâː
- PART: na, ta
- PROPN: Ìlleːlàg
- VERB: sôn, bìyash, cîn, tàhiyàː, taːs'às, tàhiyàk, yîn, zaman, ɓaːcìn, aihùwaz
- VERB-Vnoun: sôn, bìyash, cîn, taːs'às, tàhiyàk, tàhiyàː, yîn, zaman, ɓaːcìn, aihùwaz
- Def
- ADV: nan
- DET: nan, tan
- NOUN: wurîn, rân, abìn, dumèn, gàrîn, sân, ɗan, duːniyàg, loːkàcîn, làːbàːrûn
- PRON: wànnan
- Spec
- DET: wani, wasu
- PRON: wani, wasu
Degree and Polarity
- Neg
- AUX: bài, baː'à, baːkà, bàkà, bàkù, baːmù, bàn, bàtà, baːkù, bàsù
- PART: baːbù, ba, bâː, bàː
Verbal Features
- Aor
- AUX: shì, à, ìn, tà, kà, sù, kì, kù, mù
- Iter
- PART: ta
- Perf
- AUX: yac, ankà, sunkà, yaː, kaː, tac, yaz, naː, taː, yat
- Jus
- VERB: bàri, shìga, dìːba, i, rùmaː, tàhi, wùceː, saː, shìryaː, ƙàːraː
- Fut
- AUX: zâːku, zâːshi
- Cau
- VERB: tassheː, bâsshee, hîrkassheː, jèːwàyès, s'aisà, tassà, tâːkassà, tàssa
- Stat
- VERB-Part: bìye, tàhe, tsàye, kwànce, màːlìye, tàushe, zàmne
Pronouns, Determiners, Quantifiers
- Art
- DET: nan, tan
- Ind
- ADV: koː'ìnaː
- DET: wata, wani, wasu
- PRON: koːmiː, koːwaː, wani, wasu, wâggàːshi, wata
- Int
- ADJ: wacèː
- ADV: ƙàːƙàː, ìnaː
- DET: wanèː, waccè
- PRON: miː
- Prs
- PRON: shiː, =tai, ita, shì, shi, =tà, =nai, =sù, =kà, mai
- Rel
- ADV: indà, indàduk, duwwàdà, indàdun, indàdut, koːìnaː, koːƙàːƙàː, kóːkòːindà, wàdà
- PRON: sà'addà, wandà, wa'ànda
- Tot
- ADV: duh
- DET: dug, du', dus, duy, dub, dukà, dun, dut, duk, dûn
- Yes
- PRON: kâinai, kânka
- 1
- AUX: ìn, naː, inàː, mukà, mù, baːmù, bàn, mun, ani, munkà
- PRON: niː, min, =na, ni, =taː, =naː, nì, shi
- 2
- AUX: kaː, kà, kukà, kun, bàkà, bàkù, kanàː, kat, kì, kù
- PRON: =kà, =ka, kai, kà, maː, =kì, ka, =kù, keː, =ku
- 3
- AUX: yac, sunkà, yaː, shì, shinàː, tac, tà, yaz, tanàː, taː
- PRON: shiː, =tai, ita, shì, =tà, shi, =nai, =sù, mai, suː
- 4
- AUX: ankà, akà, à, anàː, baː'à, an
Other Features
- Deixis
- Prox
- ADV: nân, nanânga, nân.
- DET: ga, wàccân
- PRON: wàgga, wànga, wànnan
- ExtPos
- ADJ
- ADJ: hìyayyem
- ADV
- ADV: sai
- SCONJ: ha', har
- VERB: bìye, tàhe, kwàːni, taːshì, tsàye, kwànce, màːlìye, tàushe
- VERB-Part: bìye, tàhe, tsàye, kwànce, màːlìye, tàushe
- NOUN
- PART: mài
- PRON: =ta
- VERB-Vnoun: tàhiyàː, yîː, zakkùwaː, sôn, bìyash, gàmuwaː, cîn, kwaːnaː, gudùː, hwaːɗùwaː
- PRON
- ADV: duh
- DET: du', dut, duk, dus, duy, dûn
- PartType
- Int
- CCONJ: koː
- PART: ba, hwa, shìn
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Example: _.
- This corpus uses 1 lemma as an auxiliary (aux). Example: _.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (28)
- VERB--PRON (8)
- VERB-Vnoun--NOUN (1)
- obj
- VERB--NOUN (116)
- VERB--NOUN-ADP(dà) (1)
- VERB--PRON (45)
- VERB--PRON-Nom (1)
- VERB-Vnoun--NOUN (1)
- VERB-Vnoun--PRON (1)
- iobj
- VERB--PRON (10)
- VERB--PRON-Ben (31)
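The parent-child counts above can be approximated by following each word's HEAD pointer and checking the UPOS of both ends of the relation. The sketch below is a simplified illustration, not the script behind this page: it ignores the finer labels shown above (`Vnoun`, `Ben`, `Nom`, the `ADP(dà)` marker), the helper names are hypothetical, and the two sample sentences are invented placeholders rather than treebank data.

```python
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}


def make_row(token_id, form, upos, head, deprel):
    """Build one 10-column CoNLL-U line; columns not needed here are '_'."""
    return "\t".join([token_id, form, "_", upos, "_", "_", head, deprel, "_", "_"])


def read_sentences(conllu_text):
    """Yield each sentence as a dict mapping word ID -> (UPOS, HEAD, DEPREL)."""
    words = {}
    for line in conllu_text.splitlines():
        if not line.strip():            # a blank line ends a sentence
            if words:
                yield words
            words = {}
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if cols[0].isdigit():           # skip multi-word ranges and empty nodes
            words[cols[0]] = (cols[3], cols[6], cols[7])
    if words:
        yield words


def core_argument_patterns(conllu_text):
    """Count core relations whose parent is a VERB and child a NOUN or PRON."""
    patterns = Counter()
    for words in read_sentences(conllu_text):
        for upos, head, deprel in words.values():
            parent = words.get(head)    # None for the root (HEAD = 0)
            if (deprel in CORE_RELATIONS and parent is not None
                    and parent[0] == "VERB" and upos in {"NOUN", "PRON"}):
                patterns[(deprel, f"VERB--{upos}")] += 1
    return patterns


# Invented placeholder sentences (forms w1..w6 are not real words).
sample = "\n".join([
    make_row("1", "w1", "NOUN", "2", "nsubj"),
    make_row("2", "w2", "VERB", "0", "root"),
    make_row("3", "w3", "PRON", "2", "obj"),
    "",
    make_row("1", "w4", "VERB", "0", "root"),
    make_row("2", "w5", "NOUN", "1", "obj"),
    make_row("3", "w6", "NOUN", "2", "nmod"),   # not a core relation
    "",
])
patterns = core_argument_patterns(sample)
for (relation, pattern), count in sorted(patterns.items()):
    print(relation, pattern, count)
```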
Verbs with Reflexive Core Objects
- This corpus contains 1 lemma that occurs at least once with a reflexive core object (obj or iobj). Example: han- kâinai