UD Hausa SouthernAutogramm
Language: Hausa (code: ha)
Family: Afro-Asiatic
This treebank has been part of Universal Dependencies since the UD v2.14 release.
The following people have contributed to making this treebank part of UD: Bernard Caron.
Repository: UD_Hausa-SouthernAutogramm
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: spoken
Questions, comments? General annotation questions (either Hausa-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [bernard • l • caron (æt) gmail • com]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
This treebank contains the Southern Autogramm data, which cover the Zaria dialect of Hausa (Southern Hausa), spoken in Nigeria.
Zaria (Southern) Hausa is a “modern” variety of the language in which the three-way opposition (masculine / feminine / plural) has been abandoned in the noun system: only the plurality feature is maintained there, while feminine gender is kept in the pronominal and TAM systems.
The treebank contains 1,918 sentences and 14,585 tokens.
It is maintained in the SUD framework (SUD_Hausa-SouthernAutogramm) and converted automatically to UD.
Acknowledgments
References
Caron, Bernard. 2015. Hausa Grammatical Sketch. In Amina Mettouchi, Martine Vanhove & Dominique Caubet (eds.), Corpus-based Studies of Lesser-described Languages. The CorpAfroAs corpus of spoken AfroAsiatic languages. Amsterdam-Philadelphia: John Benjamins. https://halshs.archives-ouvertes.fr/halshs-00647533.
Statistics of UD Hausa SouthernAutogramm
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB – X
Features
Aspect – Case – Definite – Deixis – ExtPos – Foreign – Gender – Number – PartType – Person – Polarity – PronType – Reflex – Tense – VerbForm – Voice
Relations
acl – acl:relcl – advcl – advcl:cleft – advmod – amod – appos – aux – case – cc – cc:preconj – ccomp – compound – compound:svc – conj – cop – csubj – dep – det – discourse – dislocated – fixed – flat – flat:foreign – flat:name – iobj – mark – nmod – nsubj – nummod – obj – obl – obl:arg – parataxis – parataxis:parenth – punct – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 1918 sentences and 14585 tokens.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 46 types of words that contain both letters and punctuation. Examples: na'àm, sa'ànnan, zaː'à, baː'àː, sànaː'àn, mêː-mêː, du', Mà'aːzù, kalàː-kalàː, loːkàci:, sa'àn, zaːmàni:, Tudùn-Wàda, Tudùn-Wàdân, baƙa-baƙaː, baː'à, dakà:, es-ès, eː'àː, gar̃gajiya:, gaːdìː-n, gishiri-gishiri, gàriː-ǹ, gìne-gìne, ha', hanyà:, hanyàː-n, haʔà:, irìː-irìː, irìː-n, jeːfì-jeːfì, ka:, koːwa:, lâifiː-n, mà'àːnaː, m̀:hm:, r̃uwa-r̃uwa, shùːke-shùːke, su:, sàbà'in, tsoːhoː-n, àl'amur̃àː, ƙaɓe-ƙàɓè, ƙuliː-ƙulin, ƴan'uwân, ɗai-ɗai
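The counts in the list above can be approximated from a CoNLL-U file with a short script. The following is a minimal sketch, not the procedure used to generate this page: the function name `corpus_stats` is hypothetical, it counts word lines with plain integer IDs (so multiword-token ranges and empty nodes are skipped), and it assumes that "letters" and "punctuation" correspond to the Unicode general categories `L*` and `P*`, which may differ from the definition used for the official statistics.

```python
import unicodedata


def corpus_stats(conllu_text):
    """Return (sentence count, word count, set of mixed letter/punctuation forms)."""
    sentences = 0
    tokens = 0
    mixed_types = set()
    in_sentence = False
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) carry no tokens.
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if not cols[0].isdigit():
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
        form = cols[1]
        # Assumption: Unicode categories L* are letters, P* are punctuation.
        has_letter = any(unicodedata.category(c).startswith("L") for c in form)
        has_punct = any(unicodedata.category(c).startswith("P") for c in form)
        if has_letter and has_punct:
            mixed_types.add(form)
    return sentences, tokens, mixed_types
```

Called on the full text of a treebank file, the function returns the two totals and the set of mixed forms, whose size can be compared with the number of types reported above.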
Morphology
Tags
- This corpus uses 16 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB, X
- This corpus does not use the following tags: SYM
- This corpus contains 45 word types tagged as particles (PART): ba, baːbù, bà, bà~, bàː, bâ, bâː, cân, cèː, dai, dà, fa, fâ, gaːra, gà, gàː, ha, kar̃, kaɗà, koː, kuma, kàm, kâi, kènan, kèːnan, kòː, kùwa, kùwâː, maː, mm, mài, màːsu, mêː, mêː-mêː, na, neː, nèː, sòː, ta, wàːtòn, wàːtòː, zâː, àkwai, àʔàː, ɗin
- This corpus contains 84 lemmas tagged as pronouns (PRON): =kà, =kì, =kù, =mu, =mù, =naː, =shi, =shì, =su, =sù, =ta, =tà, =ya, dukà, ita, ka, kai, keː, ki, koːmeː, koːwa:, koːwaː, ku, kuː, kâinaː, kânkà, kânmù, kânshì, kântà, kù, makà, manà, manàː, mashì, masà, masù, matà, mikì, mishì, mu, mukù, musù, muː, mâi, mèneːnèː, mèː, mèːneː, mèːneːnèː, mîn, mù, naːkà, naːkù, naːmù, naːshì, naːsù, naːtà, ni, niː, nàːwa, nàːwaː, shi, shiː, su, su:, suː, sù, ta, taːkù, tà, wancàn, wandà, wani, wannàn, waɗàndà, waɗànnan, waː, waːnè, wutaː, wànnan, wàː, wàːneː, wàːneːnèː, wânnan, ʔàʔè
- This corpus contains 17 lemmas tagged as determiners (DET): dukà, nan, nàn, nân, su, wancàn, wancàː, wani, wannàn, wasu, wata, waɗànnan, wànneː, wànè, wânnan, ɗin, ɗîn
- Out of the above, 7 lemmas occurred sometimes as PRON and sometimes as DET: dukà, su, wancàn, wani, wannàn, waɗànnan, wânnan
- This corpus contains 1 lemma tagged as an auxiliary (AUX): _
- There are 2 (de)verbal forms:
- Part
- VERB: zàune, jìbge, kwànce
- Vnoun
- VERB: noːman, noːmaː, yîː, zuwàː, saːmùn, yîn, jîn, cîː, sôː, tunàːwaː
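The unused-tag line and the PRON/DET overlap in the list above are simple set operations over the LEMMA and UPOS columns. The sketch below illustrates this under stated assumptions: `tag_summary` is a hypothetical helper, the input is CoNLL-U text with ten tab-separated columns per word line, and the set of 17 tags is the universal UPOS inventory.

```python
from collections import defaultdict

# The 17 universal part-of-speech tags defined by UD.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}


def tag_summary(conllu_text):
    """Return (unused UPOS tags, lemmas tagged both PRON and DET)."""
    lemmas_by_tag = defaultdict(set)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular word lines: ten columns and an integer ID.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        lemma, upos = cols[2], cols[3]
        lemmas_by_tag[upos].add(lemma)
    unused = UPOS_TAGS - set(lemmas_by_tag)
    # .get avoids creating empty entries for tags that never occur.
    both = lemmas_by_tag.get("PRON", set()) & lemmas_by_tag.get("DET", set())
    return unused, both
```

On this treebank one would expect `unused` to contain only SYM and `both` to contain the seven lemmas listed above, provided the official statistics are computed the same way.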
Nominal Features
- Fem
- ADJ: ƴar̃
- AUX: ta, tà, taː, tanàː, kin, takàn, zaːtà, kikà, kì, bàtà
- DET: wata
- NOUN: dòːdannìyaː, mangàr̃àɗ, marigâyyaː, ƙoːfàr̃, ƴar̃
- PART: cèː
- PRON: ita, =tà, tà, ta, matà, keː, naːtà, =kì, ki, wata
- Masc
- AUX: ya, yaː, kaː, kà, yà, yanàː, ka, zâi, kanàː, bàkà
- DET: wani
- NOUN: mahàifin
- PRON: shiː, shi, =shì, shì, mishì, makà, kai, =shi, mashì, =kà
- Plur
- ADJ: ƴan, tsòːfàffin
- AUX: sukà, mukà, mù, munàː, sun, sunàː, mukàn, mun, kukàn, sù
- DET: su, wasu, waɗànnan
- NOUN: shaːnuː, mutàːneː, dabboːbiː, abuːbuwàː, yâːraː, abuːbuwàn, saiwoːyiː, maːtan, riːjiyoːyiː, ƴaːƴan
- PART: màːsu
- PRON: =sù, suː, =mù, manà, muː, mù, naːmù, su, mu, sù
- PROPN: Fulàːniː, Filàːniː, Kanaːwaː, Katsinaːwaː, Tuːr̃aːwaː, Bàfilàːnin, Fulàːnîn, Sakkwataːwaː
- VERB: caccànzaː, masàyaː, ciccìkaː, daddàurè, r̃ar̃r̃àbaː, tattàːrà, tàttàfi, yanyànkà, yâːraː
- Sing
- AUX: naː, zân, zaːkà, inàː, ìn, na, zaːʼà, bàn, bân, zaːkì
- PRON: niː, mîn, =naː, ni, nì, nàːwa, kâinaː, nàːwaː
- Dat
- ADP: mà, wà
- PRON: mishì, mîn, makà, manà, mashì, matà, masù, mâi, masà, mikì
- Gen
- PRON: =naː, =tà, naːkù, naːsù, nàːwa
- Nom
- PRON: shiː, ita, niː, suː, muː, ni, kai, mu, duy, shi
- Cons
- ADJ: ainihin, farin, saːbon, baƙin, bàbban, kaurin, tsantsan, tsoːhon, tsòːfàffin
- ADP: irìn
- ADV: bana, yànzûn
- DET: waɗànnan
- NOUN: àbin, gidan, suːnan, gàrin, irìn, sauran, loːkàcin, ruwan, goːnan, tsaːmiyan
- NUM: sìttin, tàlàːtin, àshìr̃in, ɗayan, ɗàr̃in, goːmàn
- PART: na, ɗin
- PROPN: Ùngwan, Gùndumàn, Fulàːniː, Bàtuːr̃èn, Maːlàn, Muːsa, Saːnin, Ɗan, Bàfilàːnin, Sakkwataːwaː
- VERB: noːman, saːmùn, yîn, jîn, neːman, kiràn, sôn, cîn, ganin, saːran
- VERB-Vnoun: noːman, saːmùn, yîn, jîn, neːman, kiràn, sôn, cîn, ganin, saːran
- X: kùr̃ùngùn
- Def
- ADV: nan
- DET: wânnan, nan, ɗîn, waɗànnan, wànè, wannàn, wànneː
- NOUN: àbîn, loːkàcîn, àbin, irìn, wân, ƙanèn, daːjìn, dàliːlìn, yaːrinyàn, gidân
- NUM: àr̃bàʼin, àshìr̃in, mìliyàn, sàbà'in, sàbàʼin, sìttin
- PART: wàːtòn
- PRON: wânnan, wànnan, waɗànnan
- PROPN: Bàːsân, Filàːnîn, Ìsìlàːmiyàn, Fulàːnîn
- VERB-Vnoun: noːmân
- Spec
- DET: wani, wasu, wata
- PRON: wani
Degree and Polarity
- Neg
- AUX: bàkà, bàn, bài, bàmù, bàʼà, baː'àː, bâi, baːkàː, baːmàː, baːyàː
- PART: ba, bàː, bà, bâː, baːbù, kar̃, kaɗà, bà~, bâ
Verbal Features
- Aor
- AUX: kà, à, yà, mù, tà, sù, ìn, kù, kì, shì
- Hab
- AUX: mukàn, kukàn, takàn, yakàn, akàn, sukàn
- Iter
- PART: ta
- Perf
- AUX: yaː, kaː, an, naː, sun, mun, taː, kin, kun, am
- PerfBkg
- AUX: ya, ta, akà, sukà, mukà, ka, na, kikà, kukà, kakèː
- PerfNeg
- AUX: bàkà, bàn, bài, bàmù, bàʼà, bàsù, bàtà, bâi, bàkì, baːyàː
- Prog
- AUX: yanàː, anàː, munàː, sunàː, inàː, nàː, kanàː, tanàː, kunàː, kinàː
- ProgBkg
- AUX: akèː, mukèː, kukèː, yakèː, sukèː, kèː, kakèː, kukà, takèː, kikèː
- ProgLocBkg
- AUX: yakè, mukè, sukè, kakè, kukè, nakè, takè, akè, nikè, shikè
- ProgNeg
- AUX: baː'àː, baːkàː, baːmàː, baːkà, baːyàː, bân, baːtà, baː'à, baːnàː, baːsàː
- Fut
- AUX: zâi, zân, zaːkà, zaːsù, zaːʼà, zaːtà, zaː'à, zaːkì, zaːmù, zaːʔà
- Pred
- AUX: kyâː, kâː, mwâː, tâː, âː
- Cau
- VERB: sayar̃, s~
- Stat
- VERB-Part: zàune, jìbge, kwànce
Pronouns, Determiners, Quantifiers
- Dem
- ADV: nân
- DET: wannàn, nàn, wancàn, nân, wânnan
- PRON: wannàn, wancàn, wànnan
- Ind
- DET: wani, wasu, wata, wânnan
- PRON: koːmeː, koːwaː, wani, koːmiː, koːwa:, waːnè
- Int
- ADV: ìnaː, yàːyàː, yàushèː, yàyàː, yàː
- DET: wànè
- NUM: nawà
- PRON: mèː, mèːneːnèː, wàː, mèneːnèː, mèːneː, wàːneː, wàːneːnèː, wǎːi, wâː
- Prs
- PRON: shiː, shi, ita, =shì, =tà, shì, =sù, suː, mishì, tà
- Rel
- ADV: yandà, indà, yaddà
- PRON: wandà, waɗàndà
- Tot
- DET: dug, duk, dun, dus
- PRON: dukà, duk, dun, duy
- Yes
- PRON: kânmù, kânshì, kâinaː, kâmmù, kânkà, kântà
- 1
- AUX: mukà, naː, mù, zân, munàː, inàː, mukàn, mun, ìn, mukèː
- PRON: niː, mîn, =mù, manà, muː, mù, =naː, naːmù, ni, mu
- 2
- AUX: kaː, kà, ka, zaːkà, kanàː, kukàn, kin, bàkà, kukèː, kakèː
- PRON: suː, kà, kai, makà, ka, keː, ki, dukà, ku, mukù
- 3
- AUX: ya, yaː, ta, sukà, yà, yanàː, zâi, sun, tà, sunàː
- PRON: shiː, shi, ita, =shì, =tà, shì, =sù, mishì, tà, ta
- 4
- AUX: akà, à, an, anàː, akèː, zaː'à, akàn, bàʼà, kà, baː'àː
- PRON: makà, mâː
Other Features
- Deixis
- Prox
- ADV: nân
- DET: wannàn, nàn, nân, wânnan
- PRON: wannàn
- Remt
- DET: wancàn
- PRON: wancàn, wànnan
- ExtPos
- ADJ
- NOUN: mài
- PART: mài
- ADP
- ADP: à, har̃, ta
- NOUN: kàmaː
- PRON: dud
- ADV
- ADP: had, à
- ADV: sai, keː, tàːre
- NOUN: gàbaː, gidan
- SCONJ: dà, kàman, wai
- VERB: zàune, kwànce
- VERB-Part: zàune, kwànce
- NOUN
- NOUN: mài, tsoːhuwaː, har̃kàn, niːsaː, noːmân, sigàː, yawàː, zoːmoː, zufàː, ƙùllun
- NUM: goːmàn
- PART: mài
- PROPN: Basaːwaː
- VERB: noːman, noːmaː, yîː, zuwàː, saːmùn, yîn, jîn, cîː, sôː, tunàːwaː
- VERB-Vnoun: noːman, noːmaː, yîː, zuwàː, saːmùn, yîn, jîn, cîː, sôː, tunàːwaː
- PRON
- NOUN: àbin
- PRON: wandà
- SCONJ
- ADP: kàman
- SCONJ: in, koː, koːdà
- VERB
- VERB: ci
- Foreign
- Yes
- ADV: especia~
- CCONJ: but
- INTJ: OK
- NOUN: poison, police, kilaːs, Eːbìːyù, TV, bìr̃êːk, chemistry, drinks, inspector̃, juice
- PART: sòː
- PROPN: Feːdar̃al, Gor̃illas
- VERB: checking, escaping, pr̃etending
- X: lìllaːhì, àlhamdù, sùkûːl, Allàː, Riːmiː, bitch, dubuː, fir̃aːmar̃i, huɗu, hù
- PartType
- Int
- CCONJ: koː
- NOUN: bàːbâː, gaudâː
- PART: kèːnan, ba, kùwa, fa, kuma, kùwâː, bâː, fâ, neː, bâ
- PRON: wâː
- VERB: shâː, zuwâː
- VERB-Vnoun: zuwâː
Syntax
Auxiliary Verbs and Copula
- This corpus uses 1 lemma as a copula (cop). Examples: _.
- This corpus uses 1 lemma as an auxiliary (aux). Examples: _.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
- VERB--NOUN (90)
- VERB--PRON (10)
- VERB--PRON-Nom (1)
- VERB-Vnoun--NOUN (4)
- obj
- VERB--NOUN (384)
- VERB--NOUN-ADP(dà) (1)
- VERB--NOUN-ADP(kân) (1)
- VERB--NOUN-ADP(na~ta) (3)
- VERB--NOUN-ADP(wai) (1)
- VERB--PRON (135)
- VERB--PRON-ADP(dà) (1)
- VERB--PRON-Nom (4)
- VERB-Vnoun--NOUN (17)
- VERB-Vnoun--PRON (8)
- iobj
- VERB--PRON (44)
- VERB--PRON-Dat (92)
- VERB--PRON-Nom (2)
- VERB-Part--PRON (2)
- VERB-Vnoun--PRON (1)
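Counts of this kind come from pairing each dependent with its head inside a sentence. The following is a simplified sketch of that pairing, not the script behind the figures above: `core_argument_counts` is a hypothetical name, only exact `nsubj`, `obj` and `iobj` labels are matched, and the labels omit the VerbForm, Case and adposition details (such as `VERB-Vnoun` or `PRON-Dat`) that the list above distinguishes.

```python
import re
from collections import Counter

CORE_RELATIONS = {"nsubj", "obj", "iobj"}


def core_argument_counts(conllu_text):
    """Count (relation, 'VERB--CHILD') pairs for noun and pronoun dependents of verbs."""
    counts = Counter()
    # Sentences in CoNLL-U are separated by blank lines.
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        rows = {}
        for line in block.splitlines():
            cols = line.split("\t")
            if len(cols) == 10 and cols[0].isdigit():
                rows[cols[0]] = cols
        for cols in rows.values():
            upos, head_id, deprel = cols[3], cols[6], cols[7]
            head = rows.get(head_id)  # None for the root (HEAD = 0)
            if head is None or deprel not in CORE_RELATIONS:
                continue
            if head[3] == "VERB" and upos in ("NOUN", "PRON"):
                counts[(deprel, "VERB--" + upos)] += 1
    return counts
```

Extending the label with features from the FEATS column of the head or the child would bring the output closer to the breakdown shown above.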
Verbs with Reflexive Core Objects
- This corpus contains 1 lemma that occurs at least once with a reflexive core object (obj or iobj). Examples: yi kânmù
Relations Overview
- This corpus uses 8 relation subtypes: acl:relcl, advcl:cleft, cc:preconj, compound:svc, flat:foreign, flat:name, obl:arg, parataxis:parenth
- The following 5 relation types are not used in this corpus at all: expl, clf, list, orphan, goeswith
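Both figures in this overview follow from the relation list given under "Relations" near the top of the page. As a small worked check, the sketch below derives them with set operations; `relation_overview` is a hypothetical helper, `TREEBANK_RELATIONS` is copied from that list, and `UNIVERSAL_RELATIONS` is the set of 37 universal dependency relations defined by UD.

```python
# The 37 universal dependency relations defined by UD.
UNIVERSAL_RELATIONS = set("""
acl advcl advmod amod appos aux case cc ccomp clf compound conj cop csubj
dep det discourse dislocated expl fixed flat goeswith iobj list mark nmod
nsubj nummod obj obl orphan parataxis punct reparandum root vocative xcomp
""".split())

# Relations listed for this treebank in the "Relations" section of the page.
TREEBANK_RELATIONS = """
acl acl:relcl advcl advcl:cleft advmod amod appos aux case cc cc:preconj
ccomp compound compound:svc conj cop csubj dep det discourse dislocated
fixed flat flat:foreign flat:name iobj mark nmod nsubj nummod obj obl
obl:arg parataxis parataxis:parenth punct reparandum root vocative xcomp
""".split()


def relation_overview(relations):
    """Return (sorted relation subtypes, sorted unused universal relations)."""
    # A subtype is any label of the form base:subtype.
    subtypes = sorted(r for r in relations if ":" in r)
    # The base type is the part before the colon.
    base_types = {r.split(":")[0] for r in relations}
    unused = sorted(UNIVERSAL_RELATIONS - base_types)
    return subtypes, unused


subtypes, unused = relation_overview(TREEBANK_RELATIONS)
```

Here `subtypes` holds the eight subtyped labels and `unused` holds clf, expl, goeswith, list and orphan, in agreement with the two lines above.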