Treebank Statistics: UD_Beja-Autogramm: POS Tags: NOUN
There is 1 NOUN lemma (6%), along with 458 NOUN types (23%) and 1719 NOUN tokens (14%).
Out of 16 observed tags, the rank of NOUN is: 8 in number of lemmas, 2 in number of types and 4 in number of tokens.
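Lemma, type and token counts of this kind can be derived directly from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the sample rows, the function names `pos_counts` and `rank`, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical CoNLL-U rows (10 tab-separated columns each). The word forms
# appear on this page, but the annotation shown here is invented.
ROWS = [
    ("1", "tak", "_", "NOUN", "_", "Gender=Masc", "2", "nsubj", "_", "_"),
    ("2", "mʔa", "_", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "tak", "_", "NOUN", "_", "Gender=Masc", "2", "obj", "_", "_"),
    ("4", "doːr", "_", "NOUN", "_", "Gender=Masc", "2", "obl:mod", "_", "_"),
    ("5", ".", "_", "PUNCT", "_", "_", "2", "punct", "_", "_"),
]
SAMPLE = "# sent_id = hypothetical\n" + "\n".join("\t".join(r) for r in ROWS) + "\n"


def pos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # blank separator or comment line
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {p: (len(lemmas[p]), len(types[p]), tokens[p]) for p in tokens}


def rank(counts, pos, index):
    """1-based rank of `pos` by column `index` (0=lemmas, 1=types, 2=tokens).

    Ties keep first-seen order here; how the page breaks ties is not stated.
    """
    ordered = sorted(counts, key=lambda p: counts[p][index], reverse=True)
    return ordered.index(pos) + 1


counts = pos_counts(SAMPLE)
print(counts["NOUN"])            # (1, 2, 3)
print(rank(counts, "NOUN", 2))   # 1
```

Because every tag in this treebank has the single lemma `_`, the lemma rank is a tie across all tags, so the reported rank of 8 depends on whatever tie-breaking the generator uses.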
The 10 most frequent NOUN lemmas: _
The 10 most frequent NOUN types: tak, doːr, ʔar, mhiːn, naː, na, dhaj, kaːm, lhaweː, ʔoːr
The 10 most frequent ambiguous lemmas: _ (VERB 2410, PUNCT 2363, DET 1737, NOUN 1719, PRON 820, ADP 766, SCONJ 592, CCONJ 338, PART 321, AUX 284, ADV 191, ADJ 149, X 73, INTJ 66, PROPN 63, NUM 59)
The 10 most frequent ambiguous types: naː (NOUN 47, PRON 5), dhaj (NOUN 32, ADP 1), =na (NOUN 18, PART 2), wari (NOUN 5, ADJ 2), mʔa (NOUN 4, VERB 4), raw (ADJ 6, NOUN 4), ndi (NOUN 3, AUX 2), kna (PRON 32, NOUN 2), nafs (NOUN 2, PRON 2), ʔani (NOUN 2, PRON 2)
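The token share and token rank reported for NOUN can be cross-checked against the counts listed for the lemma `_`. This is a minimal sketch, assuming those per-tag counts equal the per-tag token totals (plausible because the NOUN figure, 1719, matches the NOUN token count, but the page does not state it).

```python
# Per-tag counts copied from the ambiguous-lemma entry for "_".
TOKENS_BY_TAG = {
    "VERB": 2410, "PUNCT": 2363, "DET": 1737, "NOUN": 1719,
    "PRON": 820, "ADP": 766, "SCONJ": 592, "CCONJ": 338,
    "PART": 321, "AUX": 284, "ADV": 191, "ADJ": 149,
    "X": 73, "INTJ": 66, "PROPN": 63, "NUM": 59,
}

total_tokens = sum(TOKENS_BY_TAG.values())
noun_share = round(100 * TOKENS_BY_TAG["NOUN"] / total_tokens)

# Rank tags by descending token count; NOUN's position is its token rank.
ranked = sorted(TOKENS_BY_TAG, key=TOKENS_BY_TAG.get, reverse=True)
noun_rank = ranked.index("NOUN") + 1

print(total_tokens, noun_share, noun_rank)  # 11951 14 4
```

Under that assumption the result agrees with the reported 14% of tokens and rank 4 in number of tokens.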
Morphology
The form / lemma ratio of NOUN is 458.000000 (the average of all parts of speech is 126.812500).
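The ratio is simply the number of distinct forms divided by the number of distinct lemmas, which is why a tag whose only lemma is `_` gets a ratio equal to its type count. The sketch below reproduces the figure and, assuming each of the 16 tags has exactly one lemma (an inference from the 6% lemma share, not something the page states), recovers the implied total of per-tag types from the average.

```python
def form_lemma_ratio(n_types, n_lemmas):
    """Distinct word forms per distinct lemma for one part of speech."""
    return n_types / n_lemmas


noun_ratio = form_lemma_ratio(458, 1)
print(f"{noun_ratio:.6f}")  # 458.000000

# Assumption: 16 tags with one lemma each, so the average ratio times 16
# equals the sum of per-tag type counts.
implied_total_types = 126.8125 * 16
noun_type_share = round(100 * 458 / implied_total_types)
print(implied_total_types, noun_type_share)  # 2029.0 23
```

The implied share of 23% matches the type percentage reported for NOUN, which suggests the average is taken over tags with a single lemma each.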
The 1st highest number of forms (458) was observed with the lemma “_”: =na, =naː, alla, allaː, allaːj, allaːji, aːmas, babʔa, baji, balad, balami, bani, baraːm, baxit, baʃar, baːb, baːgi, bhali, bhar, bhoː, bilbil, biri, bissa, boːj, bri, buːn, bʔaɖ, bʔaɖaɖ, bʔaɖaɖa, bʔeː, da, daba, damaːn, dammʔara, dangar, dar, dara, darab, dawaːhi, daːj, daːjinaj, daːr, daːsat, deː, dhaj, dhar, dijja, dikʷkʷaːn, diraːr, dirhim, dirʔa, dirʔaː, diwaːn, diːnaːra, doːr, doːra, duːr, dwaːn, dʔam, dʔawaː, faggad, far, faras, fassalat, fatiːra, fatiːraː, faʤil, findikʷ, finʤan, finʤaːn, firaʃ, firha, firkaːk, firʔa, fiːr, gab, gabal, gahwat, gamiːs, ganaj, ganaːj, ganfit, garb, gasir, gaw, gawa, gaɖʔa, gaːdi, gaːrati, gijaːma, ginh, ginha, ginuːf, ginʔ, girab, giriʃ, giriʃa, giriʃaː, girma, giɖʔa, gʷargʷadi, gʷaːb, gʷbi, gʷibi, gʷʔanaːti, hadʔa, hadʔaaːjji, hadʔaːjeː, hagiː, hajawaːna, hajʔa, halak, halaka, halaːgaːj, halaːwaː, halla, hamniː, hamoː, handi, hankʷila, hanʤar, haraːmʤʔoːr, hargʷi, hari, harroː, harʔa, hasir, hataːj, hawat, hawil, hawlijaːj, haɖa, haˈwaːd, haː, haːdoːti, haːl, haːʃ, heːlaj, heːr, heːtaː, hi, hiit, hikuːma, hikuːmaː, his, hiʤ, hoː, hoːb, hoːj, hus, huːri, islaːm, iːjʔa, iːjʔaː, i̠ːjʔaː, jad, jaf, jam, jandiːnanaː, jas, jaːguːt, jaːndiːnanaː, jaːs, jhaːm, jina, jinaː, jiwaːʃi, kalawa, kam, kantuːr, karas, karaːj, karaːma, karaːmaː, kaːm, kiraːj, kiʃja, kjas, kna, koːba, koːlaj, koːma, kʷhaːn, kʷinha, kʷiːkʷʔaːj, lamma, latit, lhaweː, lil, liːgamanaː, liːlaːw, liːli, luːbja, luːl, lʔa, madar, magʷal, majʔa, mana, manan, mangaːj, manniimti, mar, martaba, martabaː, matig, mawaːʔid, maɖam, maʤaʔaː, maːl, mbaːba, mbaːbaː, mbiɖeːj, mbʔaɖ, mbʔi, meːk, meːs, mha, mhallaga, mhawaj, mhiːn, mhiːnaːn, mijaːd, mijaːj, mijʔat, mindikʷijaːj, mirkʷaːj, mirʔafi, misuːs, mittia, miʃʔari, miːlal, miːmaʃa, miːtat, mʔa, mʔakʷara, mʔam, mʔari, mʔariː, mʔaːdami, na, nabi, nafara, nafs, naweː, nawi, naː, naːj, nda, ndeː, ndi, nfʔa, ngirab, nifir, nifʔaː, nihaːs, nijaː, niːwa, noːs, 
nʔandaː, nʔeː, nʔi, rab, ragad, ragada, rajha, raw, raːw, rba, reːr, reːw, rhat, rhisat, riba, rifkaːk, rizg, sak, sakana, sakʷkʷar, sala, samaːr, sana, saraːt, saroːj, saːlhi, saːri, siganfoːj, sijaːm, sikka, sitoːboːj, siʤin, siːleːl, sjaːm, soːdʔala, soːtʔaːla, suːfa, suːg, suːr, tafsiːl, tak, takat, taktʔi, taktʔiː, talga, tam, taman, tamaʔa, tanaː, tanʔa, tarab, tarabeː, tariːga, tarʤimaːl, taːga, tiji, tijoː, tilʔi, tirig, tiːlal, tji, trig, tʔiit, wadak, wali, walia, wanas, wara, wari, wast, waːsir, waːw, waːʤʤa, weːnaː, wjaː, wʔa, wʔaː, xadaːra, xaddam, xawaːʤa, zamaːn, zirʔa, ɖab, ɖambijaː, ɖa~ɖib, ɖa~ɖibti, ɖaːbanaː, ɖeːfa, ɖhaniːni, ɖiwa, ʃa, ʃabaka, ʃakeː, ʃaki, ʃakʷiːn, ʃamat, ʃanha, ʃartija, ʃawwa, ʃawweː, ʃawwi, ʃawwia, ʃaː, ʃaːbbi, ʃaːk, ʃibibat, ʃiha, ʃinhat, ʃiːtaːn, ʃiːʃik, ʃkaːm, ʃuːk, ʃuːki, ʃʔa, ʃʔaː, ʃʔoːbjaː, ʈibin, ʈiːn, ʈʔa, ʔaba, ʔabaː, ʔabuːk, ʔadeː, ʔadi, ʔadim, ʔafa, ʔagja, ʔaj, ʔajajdhaja, ʔajaːj, ʔaji, ʔala, ʔalaːma, ʔalba, ʔamaːna, ʔamaːr, ʔamma, ʔamuːl, ʔangʷil, ʔangʷiːl, ʔani, ʔankʷana, ʔannuːr, ʔanoː, ʔar, ʔarabijaːj, ʔaraw, ʔarawi, ʔaraːw, ʔard, ʔarit, ʔarːbi, ʔasir, ʔaweː, ʔawi, ʔaʃaj, ʔaː, ʔaːda, ʔaːdeː, ʔaːmanaːj, ʔaːrbi, ʔaːrbiː, ʔaːwi, ʔaːʃoː, ʔeːga, ʔeːgirim, ʔeːgrim, ʔeːtrig, ʔeːɖa, ʔibra, ʔidda, ʔimir, ʔislaːmi, ʔiʃa, ʔiʃat, ʔiʤir, ʔiːbaːb, ʔiːbaːbkina, ʔiːd, ʔoːda, ʔoːr, ʔoːt, ʔoːtanaː, ʤabanaː, ʤanna, ʤaza, ʤaːhila, ʤaːntaːji, ʤhali, ʤhar, ʤhari, ʤimʔa, ʤineːnaː, ʤinis, ʤinsa, ʤoːhar, ʤoːharaaːji, ʤoːharajaːj, ʤoːz.
NOUN occurs with 5 features: Gender (1396; 81% instances), Number (268; 16% instances), Foreign (26; 2% instances), ExtPos (2; 0% instances), Degree (1; 0% instances)
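Each percentage is the feature's count divided by the 1719 NOUN tokens, rounded to a whole number, which is why the two rarest features show as 0%. A short check using only the figures given on this page:

```python
NOUN_TOKENS = 1719

# Feature counts copied from the line above.
FEATURE_COUNTS = {
    "Gender": 1396,
    "Number": 268,
    "Foreign": 26,
    "ExtPos": 2,
    "Degree": 1,
}

# Share of NOUN tokens carrying each feature, as a rounded percentage.
feature_shares = {
    name: round(100 * count / NOUN_TOKENS)
    for name, count in FEATURE_COUNTS.items()
}
print(feature_shares)
# {'Gender': 81, 'Number': 16, 'Foreign': 2, 'ExtPos': 0, 'Degree': 0}
```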
NOUN occurs with 8 feature-value pairs: Degree=Dim, ExtPos=ADV, Foreign=Yes, Gender=Fem, Gender=Masc, Number=Coll, Number=Plur, Number=Sing
NOUN occurs with 16 feature combinations.
The most frequent feature combination is Gender=Masc (773 tokens).
Examples: tak, doːr, mhiːn, jhaːm, mijʔat, dhaj, kʷiːkʷʔaːj, gaw, haˈwaːd, ʔar
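Features, feature-value pairs and feature combinations are three different ways of counting the same CoNLL-U FEATS column. The sketch below illustrates the distinction on a hypothetical list of FEATS strings; the values are invented for the example and `parse_feats` is an illustrative helper, not part of any UD tool.

```python
from collections import Counter

# Hypothetical FEATS values for six NOUN tokens; "_" means no features.
feats_column = [
    "Gender=Masc",
    "Gender=Fem|Number=Plur",
    "Gender=Masc",
    "_",
    "Gender=Masc|Number=Sing",
    "Gender=Masc",
]


def parse_feats(feats):
    """Split a FEATS string such as 'Gender=Fem|Number=Plur' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# A combination is the whole FEATS string of one token.
combinations = Counter(feats_column)
# A feature is counted once per token that carries it, whatever its value.
features = Counter(name for f in feats_column for name in parse_feats(f))
# A feature-value pair is counted per (name, value) occurrence.
pairs = Counter(
    f"{name}={value}"
    for f in feats_column
    for name, value in parse_feats(f).items()
)

print(combinations.most_common(1))  # [('Gender=Masc', 3)]
print(features["Gender"], features["Number"])  # 5 2
print(pairs["Gender=Masc"])  # 4
```

In this toy data the combination Gender=Masc covers 3 tokens while the pair Gender=Masc covers 4, because the pair also counts tokens that carry additional features.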
Relations
NOUN nodes are attached to their parents using 24 different relations: obj (548; 32% instances), dep:comp (356; 21% instances), nsubj (344; 20% instances), obl:mod (99; 6% instances), dep:conj (56; 3% instances), dislocated:obj (51; 3% instances), nmod (45; 3% instances), reparandum (42; 2% instances), xcomp (41; 2% instances), dislocated:subj (35; 2% instances), root (22; 1% instances), obl:arg (19; 1% instances), vocative (13; 1% instances), ccomp (11; 1% instances), dislocated (10; 1% instances), fixed (6; 0% instances), acl:relcl (5; 0% instances), dep (5; 0% instances), discourse (4; 0% instances), advmod (2; 0% instances), nsubj:outer (2; 0% instances), appos (1; 0% instances), dep:redup (1; 0% instances), parataxis:parenth (1; 0% instances)
Parents of NOUN nodes belong to 13 different parts of speech: VERB (1149; 67% instances), ADP (332; 19% instances), NOUN (131; 8% instances), SCONJ (36; 2% instances), [no POS: artificial root] (22; 1% instances), X (20; 1% instances), AUX (8; 0% instances), PRON (7; 0% instances), ADJ (6; 0% instances), INTJ (3; 0% instances), NUM (2; 0% instances), PROPN (2; 0% instances), PART (1; 0% instances)
171 (10%) NOUN nodes are leaves.
655 (38%) NOUN nodes have one child.
456 (27%) NOUN nodes have two children.
437 (25%) NOUN nodes have three or more children.
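The four buckets partition all NOUN nodes by number of children, so their counts should sum to the 1719 NOUN tokens. The sketch below checks that and shows one way child counts could be derived from a sentence's HEAD column; the `child_counts` helper and the example heads are hypothetical.

```python
from collections import Counter

# Bucket counts copied from the lines above.
DEGREE_BUCKETS = {
    "leaf": 171,
    "one child": 655,
    "two children": 456,
    "three or more": 437,
}

bucket_total = sum(DEGREE_BUCKETS.values())
bucket_shares = {
    name: round(100 * count / bucket_total)
    for name, count in DEGREE_BUCKETS.items()
}
print(bucket_total, bucket_shares)


def child_counts(heads):
    """Given HEAD values for tokens 1..n of one sentence, return the number
    of children of each token (index 0 corresponds to token 1)."""
    per_head = Counter(heads)
    return [per_head.get(token_id, 0) for token_id in range(1, len(heads) + 1)]


# Hypothetical sentence: tokens 1 and 3 attach to 2, token 2 is the root,
# token 4 attaches to 3.
print(child_counts([2, 0, 2, 3]))  # [0, 2, 1, 0]
```

The bucket total comes out at 1719 and the rounded shares at 10, 38, 27 and 25 percent, in agreement with the figures above.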
The highest child degree of a NOUN node is 8.
Children of NOUN nodes are attached using 30 different relations: det (1465; 47% instances), punct (437; 14% instances), nmod:poss (269; 9% instances), acl:relcl (207; 7% instances), nmod (131; 4% instances), discourse (81; 3% instances), dep (77; 2% instances), cc (76; 2% instances), amod (63; 2% instances), dep:conj (52; 2% instances), advmod (50; 2% instances), cop (39; 1% instances), nummod (28; 1% instances), reparandum (26; 1% instances), acl (24; 1% instances), nsubj (18; 1% instances), dep:comp (13; 0% instances), dislocated:mod (8; 0% instances), aux (5; 0% instances), obj (5; 0% instances), obl:arg (5; 0% instances), vocative (4; 0% instances), dislocated:subj (3; 0% instances), appos (2; 0% instances), fixed (2; 0% instances), dep:redup (1; 0% instances), dislocated:obj (1; 0% instances), obl:mod (1; 0% instances), parataxis (1; 0% instances), parataxis:parenth (1; 0% instances)
Children of NOUN nodes belong to 16 different parts of speech: DET (1473; 48% instances), PUNCT (437; 14% instances), PRON (295; 10% instances), SCONJ (195; 6% instances), NOUN (131; 4% instances), ADP (98; 3% instances), ADJ (95; 3% instances), CCONJ (76; 2% instances), PART (70; 2% instances), VERB (62; 2% instances), AUX (45; 1% instances), NUM (42; 1% instances), ADV (32; 1% instances), INTJ (21; 1% instances), X (13; 0% instances), PROPN (10; 0% instances)