
This page pertains to UD version 2.

Treebank Statistics: UD_Pashto-Sikaram: POS Tags: NOUN

There are 145 NOUN lemmas (38%), 158 NOUN types (35%) and 232 NOUN tokens (23%). Out of 14 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
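The lemma, type and token counts above can be reproduced, in principle, by grouping the treebank's words by tag. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` and the toy English triples are hypothetical, and it assumes that a "type" is a distinct word form compared case-sensitively.

```python
from collections import defaultdict

def pos_counts(tokens):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for (form, lemma, upos) triples."""
    stats = defaultdict(lambda: {"tokens": 0, "types": set(), "lemmas": set()})
    for form, lemma, upos in tokens:
        entry = stats[upos]
        entry["tokens"] += 1          # every occurrence counts as a token
        entry["types"].add(form)      # distinct word forms (assumed case-sensitive)
        entry["lemmas"].add(lemma)    # distinct lemmas
    return {
        upos: (len(e["lemmas"]), len(e["types"]), e["tokens"])
        for upos, e in stats.items()
    }

# Hypothetical toy data, not taken from UD_Pashto-Sikaram.
toy = [
    ("books", "book", "NOUN"),
    ("book", "book", "NOUN"),
    ("books", "book", "NOUN"),
    ("reads", "read", "VERB"),
]
counts = pos_counts(toy)
print(counts["NOUN"])  # (lemmas, types, tokens)
```

Ranking the tags by any of the three numbers then gives the ranks quoted in the paragraph above.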

The 10 most frequent NOUN lemmas: ژبه, ژباړه, ارزښت, اثر, خبره, دود, هېواد, وخت, اتره, خلک

The 10 most frequent NOUN types: ژبه, ژبې, ارزښت, ژباړې, اثر, دود, وخت, سیمې, شمېر, نړۍ

The 10 most frequent ambiguous lemmas: (none observed)

The 10 most frequent ambiguous types: شي (AUX 1, NOUN 1, VERB 1)

Morphology

The form / lemma ratio of NOUN is 1.089655 (the average of all parts of speech is 1.198413).
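The reported ratio agrees with the number of NOUN types divided by the number of NOUN lemmas given earlier on this page (158 and 145). A small check, assuming that this is how the ratio is defined; the helper name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Average number of distinct word forms per lemma."""
    return n_types / n_lemmas

# Counts of NOUN types and NOUN lemmas as stated on this page.
noun_ratio = form_lemma_ratio(158, 145)
print(f"{noun_ratio:.6f}")  # 1.089655
```

A value close to 1 means that most NOUN lemmas occur in only one form in this treebank.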

The 1st highest number of forms (3) was observed with the lemma “لاس”: لاسه, لاسونه, لاسونو.

The 2nd highest number of forms (3) was observed with the lemma “هېواد”: هېواد, هېوادونه, هېوادونو.

The 3rd highest number of forms (3) was observed with the lemma “ژبه”: ژبه, ژبو, ژبې.

NOUN occurs with 4 features: Case (232; 100% instances), Gender (232; 100% instances), Number (232; 100% instances), Typo (2; 1% instances)

NOUN occurs with 9 feature-value pairs: Case=Abl, Case=Acc, Case=Loc, Case=Nom, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing, Typo=Yes

NOUN occurs with 17 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (60 tokens). Examples: اثر, ارزښت, دود, زر, شمېر, لامل, لیک, لیکدود, موټر, نصاب

Relations

NOUN nodes are attached to their parents using 17 different relations: obl (52; 22% instances), nmod (43; 19% instances), nsubj (43; 19% instances), obj (42; 18% instances), conj (20; 9% instances), root (7; 3% instances), fixed (5; 2% instances), xcomp (5; 2% instances), appos (3; 1% instances), orphan:nsubjobj (3; 1% instances), compound (2; 1% instances), obl:arg (2; 1% instances), acl (1; 0% instances), acl:relcl (1; 0% instances), advcl (1; 0% instances), obl:agent (1; 0% instances), parataxis (1; 0% instances)

Parents of NOUN nodes belong to 6 different parts of speech: VERB (135; 58% instances), NOUN (70; 30% instances), ADJ (16; 7% instances), [no tag; artificial root] (7; 3% instances), PROPN (3; 1% instances), PRON (1; 0% instances)
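The percentages in these lists are the counts divided by the 232 NOUN tokens, rounded to whole numbers, which is why small counts appear as 0%. A short sketch that recomputes them from the parent counts listed above; the key `"(root)"` is an assumed label for the entry without a tag, which matches the 7 NOUN nodes attached via the root relation:

```python
# Parent part-of-speech counts for NOUN nodes, copied from this page.
parent_counts = {
    "VERB": 135,
    "NOUN": 70,
    "ADJ": 16,
    "(root)": 7,   # assumed label: the entry printed without a tag
    "PROPN": 3,
    "PRON": 1,
}

total = sum(parent_counts.values())  # equals the number of NOUN tokens
percentages = {pos: round(100 * n / total) for pos, n in parent_counts.items()}

print(total)
for pos, pct in percentages.items():
    print(f"{pos}: {pct}%")
```

Because each value is rounded independently, the percentages need not add up to exactly 100.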

39 (17%) NOUN nodes are leaves.

80 (34%) NOUN nodes have one child.

62 (27%) NOUN nodes have two children.

51 (22%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 9.
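The child-degree figures above (leaves, one child, two children, three or more) amount to a histogram of how many dependents each NOUN node has. The sketch below shows one way to compute such a histogram from CoNLL-U text. It is an illustration under assumptions, not the tool that produced this page: the function name `child_degree_histogram` and the toy English sentence are hypothetical, and multiword-token ranges and empty nodes are simply skipped.

```python
from collections import Counter

def child_degree_histogram(conllu_text, upos="NOUN"):
    """Map number-of-children -> number of nodes with the given UPOS tag."""
    hist = Counter()
    for block in conllu_text.strip().split("\n\n"):   # one block per sentence
        rows = []
        for line in block.splitlines():
            if not line or line.startswith("#"):      # skip comment lines
                continue
            cols = line.split("\t")
            if not cols[0].isdigit():                 # skip ranges (1-2) and empty nodes (1.1)
                continue
            rows.append(cols)
        children = Counter(cols[6] for cols in rows)  # column 7 is HEAD
        for cols in rows:
            if cols[3] == upos:                       # column 4 is UPOS
                hist[children[cols[0]]] += 1
    return hist

# Hypothetical toy sentence, not taken from UD_Pashto-Sikaram.
rows = [
    ["1", "the", "the", "DET", "_", "_", "3", "det", "_", "_"],
    ["2", "old", "old", "ADJ", "_", "_", "3", "amod", "_", "_"],
    ["3", "book", "book", "NOUN", "_", "_", "4", "nsubj", "_", "_"],
    ["4", "arrived", "arrive", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "# text = the old book arrived\n" + "\n".join("\t".join(r) for r in rows)

hist = child_degree_histogram(sample)
print(dict(hist))  # the single NOUN has two children
```

On the full treebank, the largest key in the histogram would correspond to the highest child degree reported above.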

Children of NOUN nodes are attached using 22 different relations: case (111; 27% instances), amod (58; 14% instances), nmod (58; 14% instances), det (49; 12% instances), punct (27; 7% instances), cc (18; 4% instances), conj (18; 4% instances), advmod (11; 3% instances), cop (10; 2% instances), nsubj (10; 2% instances), acl:relcl (9; 2% instances), fixed (5; 1% instances), nummod (5; 1% instances), obl (5; 1% instances), mark (3; 1% instances), acl (2; 0% instances), advcl (2; 0% instances), appos (1; 0% instances), aux (1; 0% instances), flat (1; 0% instances), orphan:nsubjobj (1; 0% instances), parataxis (1; 0% instances)

Children of NOUN nodes belong to 14 different parts of speech: ADP (111; 27% instances), NOUN (70; 17% instances), ADJ (60; 15% instances), DET (49; 12% instances), PUNCT (27; 7% instances), CCONJ (18; 4% instances), VERB (16; 4% instances), PROPN (13; 3% instances), PRON (12; 3% instances), AUX (11; 3% instances), ADV (9; 2% instances), NUM (5; 1% instances), SCONJ (3; 1% instances), PART (2; 0% instances)