
This page pertains to UD version 2.

Treebank Statistics: UD_Sanskrit-Vedic: POS Tags: NOUN

There are 6396 NOUN lemmas (46%), 15108 NOUN types (41%) and 72315 NOUN tokens (35%). Out of 13 observed tags, the rank of NOUN is: 1 in number of lemmas, 1 in number of types and 1 in number of tokens.
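As a rough illustration of how counts like these could be derived from a CoNLL-U file, the sketch below tallies distinct lemmas, distinct word forms (types) and tokens per UPOS tag. It is a minimal sketch, not the script that generated this page: the function name `upos_counts` and the toy `SAMPLE` rows are hypothetical, and it assumes that types are word forms compared verbatim.

```python
from collections import defaultdict


def upos_counts(conllu_text):
    """Return {upos: (n_lemmas, n_types, n_tokens)} for a CoNLL-U string."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    tokens = defaultdict(int)
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Comment lines and blank lines do not have 10 tab-separated columns.
        if len(cols) != 10:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


# Toy rows for illustration only (not taken from the treebank);
# columns that the function does not read are left as "_".
SAMPLE = "\n".join([
    "# toy example",
    "1\tagniḥ\tagni\tNOUN\t_\t_\t_\t_\t_\t_",
    "2\tagnim\tagni\tNOUN\t_\t_\t_\t_\t_\t_",
    "3\tyajati\tyaj\tVERB\t_\t_\t_\t_\t_\t_",
    "4\tdevāḥ\tdeva\tNOUN\t_\t_\t_\t_\t_\t_",
    "5\tagniḥ\tagni\tNOUN\t_\t_\t_\t_\t_\t_",
])

print(upos_counts(SAMPLE)["NOUN"])  # (2, 3, 4)
```

Ranking the tags by each of the three numbers would then give the "rank of NOUN" figures quoted above.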

The 10 most frequent NOUN lemmas: agni, deva, indra, yajña, brahman, loka, ap, paśu, prāṇa, soma

The 10 most frequent NOUN types: _, agniḥ, devāḥ, agnim, agne, brahma, indra, indraḥ, deva, āpaḥ

The 10 most frequent ambiguous lemmas: deva (NOUN 1583, ADJ 25), yajña (NOUN 792, ADV 1), kāma (NOUN 419, ADV 5), go (NOUN 350, ADV 8), brāhmaṇa (NOUN 318, ADJ 2), āditya (NOUN 305, ADJ 20), agra (NOUN 264, ADV 2, ADJ 1), ahar (NOUN 244, ADV 1), saṃvatsara (NOUN 235, ADV 2), ratha (NOUN 218, ADV 2)

The 10 most frequent ambiguous types: _ (NOUN 2255, ADJ 407, CCONJ 331, SCONJ 216, NUM 186, PRON 124, VERB 106, INTJ 86, ADP 73, ADV 54, DET 14), devāḥ (NOUN 466, ADJ 1), deva (NOUN 253, ADJ 1), yajñam (NOUN 186, ADV 1), namaḥ (NOUN 162, VERB 1), devān (NOUN 131, ADJ 1), karma (NOUN 127, VERB 1), āyuḥ (NOUN 127, ADJ 1), jyotiḥ (NOUN 119, ADV 2), agre (NOUN 107, ADV 16)

Morphology

The form / lemma ratio of NOUN is 2.362101 (the average of all parts of speech is 2.674382).
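The ratio appears to be the number of NOUN types divided by the number of NOUN lemmas; the figures given at the top of this page reproduce it. A quick check, using only numbers stated on this page:

```python
noun_lemmas = 6396   # NOUN lemmas reported on this page
noun_types = 15108   # NOUN types (distinct forms) reported on this page

ratio = noun_types / noun_lemmas
print(f"{ratio:.6f}")  # 2.362101
```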

The highest number of forms (22) was observed with the lemma “brahman”: _, brahma, brahmabhiḥ, brahmabhyaḥ, brahman, brahmanaḥ, brahmane, brahmani, brahmanā, brahmaṇas, brahmaṇaḥ, brahmaṇe, brahmaṇi, brahmaṇā, brahmaṇām, brahmā, brahmānam, brahmāṇam, brahmāṇau, brahmāṇaḥ, brahmāṇi, brahmāṇā.

The second-highest number of forms (also 22) was observed with the lemma “deva”: _, deva, devaiḥ, devam, devasya, devau, devayoḥ, devaḥ, deve, devebhiḥ, devebhyaḥ, devena, deveṣu, devā, devān, devānt, devānām, devāsaḥ, devāt, devāya, devāḥ, devāṁ.

The third-highest number of forms (21) was observed with the lemma “anta”: _, anta, antaiḥ, antam, antataḥ, antau, antayoḥ, antayā, antaḥ, ante, antebhyaḥ, antena, anteṣu, antā, antābhiḥ, antām, antān, antāni, antāt, antāḥ, antāṁ.

NOUN occurs with 4 features: Case (64911; 90% instances), Gender (64911; 90% instances), Number (64911; 90% instances), Compound (7365; 10% instances)
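The percentages seem to be taken relative to all 72315 NOUN tokens (64911 / 72315 ≈ 90%, 7365 / 72315 ≈ 10%). The sketch below shows one way such per-feature counts could be collected from the FEATS column of a CoNLL-U file. The function name `feature_coverage` and the toy rows are hypothetical and only illustrate the counting.

```python
from collections import Counter


def feature_coverage(conllu_text, upos="NOUN"):
    """Return (n_tokens, {feature_name: n_tokens_carrying_it}) for one UPOS."""
    feature_counts = Counter()
    total = 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines with the requested UPOS tag.
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        total += 1
        if cols[5] != "_":  # "_" means the token has no features
            for pair in cols[5].split("|"):
                name = pair.split("=", 1)[0]
                feature_counts[name] += 1
    return total, dict(feature_counts)


# Toy rows for illustration only (not taken from the treebank).
FEATS_SAMPLE = "\n".join([
    "1\tagniḥ\tagni\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Sing\t_\t_\t_\t_",
    "2\tindraḥ\tindra\tNOUN\t_\tCase=Nom|Gender=Masc|Number=Sing\t_\t_\t_\t_",
    "3\tdeva\tdeva\tNOUN\t_\tCompound=Yes\t_\t_\t_\t_",
])

total, counts = feature_coverage(FEATS_SAMPLE)
for name, n in sorted(counts.items()):
    print(f"{name}: {n} ({n / total:.0%} instances)")
```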

NOUN occurs with 15 feature-value pairs: Case=Abl, Case=Acc, Case=Dat, Case=Gen, Case=Ins, Case=Loc, Case=Nom, Case=Voc, Compound=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Number=Dual, Number=Plur, Number=Sing

NOUN occurs with 73 feature combinations. The most frequent feature combination is Case=Nom|Gender=Masc|Number=Sing (9869 tokens). Examples: agniḥ, indraḥ, prajāpatiḥ, prāṇaḥ, yajñaḥ, kāmaḥ, ātmā, devaḥ, somaḥ, savitā

Relations

NOUN nodes are attached to their parents using 54 different relations: obj (10396; 14% instances), nsubj (9910; 14% instances), conj (7426; 10% instances), nmod (6686; 9% instances), flat (6050; 8% instances), root (4688; 6% instances), obl (4465; 6% instances), orphan (2104; 3% instances), obl:instr (1857; 3% instances), vocative (1686; 2% instances), obl:goal (1659; 2% instances), compound:coord (1575; 2% instances), obl:lmod (1340; 2% instances), acl (1165; 2% instances), nmod:appos (1029; 1% instances), iobj (967; 1% instances), obl:tmod (895; 1% instances), ccomp (818; 1% instances), obl:manner (728; 1% instances), advcl:ccomp (680; 1% instances), obl:source (675; 1% instances), xcomp (597; 1% instances), advcl (567; 1% instances), acl:relcl (532; 1% instances), advcl:fin (433; 1% instances), obl:soc (427; 1% instances), advcl:manner (416; 1% instances), appos (340; 0% instances), parataxis (286; 0% instances), xcomp:result (274; 0% instances), obl:agent (221; 0% instances), obl:path (190; 0% instances), obl:grad (159; 0% instances), obl:benef (156; 0% instances), amod (125; 0% instances), advcl:cond (118; 0% instances), acl:attr (111; 0% instances), compound (99; 0% instances), csubj (96; 0% instances), advcl:dpct (88; 0% instances), acl:dpct (79; 0% instances), advcl:caus (65; 0% instances), dislocated (59; 0% instances), advcl:tcl (22; 0% instances), nmod:pred (18; 0% instances), compound:name (9; 0% instances), discourse (8; 0% instances), acl:crel (4; 0% instances), advcl:concess (4; 0% instances), case (4; 0% instances), ccomp:rel (4; 0% instances), acl:ptcp (2; 0% instances), advcl:lcl (2; 0% instances), fixed (1; 0% instances)

Parents of NOUN nodes belong to 12 different categories, counting 11 parts of speech and the case where the NOUN is attached as root and therefore has no parent word: VERB (35168; 49% instances), NOUN (22956; 32% instances), ADJ (4917; 7% instances), none/root (4688; 6% instances), PRON (3000; 4% instances), ADV (597; 1% instances), NUM (538; 1% instances), ADP (178; 0% instances), PART (123; 0% instances), INTJ (59; 0% instances), CCONJ (53; 0% instances), SCONJ (38; 0% instances)

31340 (43%) NOUN nodes are leaves.

28307 (39%) NOUN nodes have one child.

8109 (11%) NOUN nodes have two children.

4559 (6%) NOUN nodes have three or more children.

The highest child degree of a NOUN node is 30.
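The four child-degree counts above partition the NOUN tokens: they add up to the 72315 tokens reported at the top of the page, and the rounded percentages follow directly. A small check using only figures from this page:

```python
noun_tokens = 72315  # total NOUN tokens reported on this page

# Number of NOUN nodes by how many children they have (from this page).
degree_counts = {
    "leaves": 31340,
    "one child": 28307,
    "two children": 8109,
    "three or more children": 4559,
}

# The four groups should cover every NOUN token exactly once.
assert sum(degree_counts.values()) == noun_tokens

percentages = {k: round(100 * n / noun_tokens) for k, n in degree_counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {degree_counts[label]} ({pct}%)")
```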

Children of NOUN nodes are attached using 62 different relations: nmod (8886; 14% instances), conj (6685; 11% instances), flat (6494; 11% instances), det (4698; 8% instances), nsubj (4490; 7% instances), amod (4473; 7% instances), acl (3770; 6% instances), orphan (2674; 4% instances), discourse (2661; 4% instances), cc (2424; 4% instances), mark (1929; 3% instances), compound:coord (1588; 3% instances), nummod (1383; 2% instances), advmod (1370; 2% instances), case (1267; 2% instances), case:sim (844; 1% instances), nmod:appos (813; 1% instances), cop (637; 1% instances), obj (473; 1% instances), mark:sim (365; 1% instances), appos (350; 1% instances), acl:relcl (340; 1% instances), parataxis (328; 1% instances), acl:dpct (282; 0% instances), ccomp (272; 0% instances), acl:attr (229; 0% instances), obl (190; 0% instances), vocative (180; 0% instances), advcl (165; 0% instances), acl:ptcp (163; 0% instances), advcl:cond (143; 0% instances), csubj (123; 0% instances), compound (100; 0% instances), iobj (81; 0% instances), obl:tmod (80; 0% instances), obl:lmod (79; 0% instances), compound:name (47; 0% instances), advcl:tcl (45; 0% instances), obl:soc (43; 0% instances), advcl:caus (40; 0% instances), obl:manner (39; 0% instances), obl:benef (37; 0% instances), advcl:fin (36; 0% instances), obl:source (34; 0% instances), obl:instr (32; 0% instances), obl:goal (25; 0% instances), advcl:manner (21; 0% instances), dislocated (18; 0% instances), nmod:pred (18; 0% instances), obl:agent (13; 0% instances), obl:grad (8; 0% instances), advcl:ccomp (6; 0% instances), advcl:dpct (6; 0% instances), xcomp (5; 0% instances), acl:crel (4; 0% instances), acl:cont (3; 0% instances), ccomp:rel (3; 0% instances), obl:path (3; 0% instances), advcl:lcl (2; 0% instances), acl:pred (1; 0% instances), aux (1; 0% instances), fixed (1; 0% instances)

Children of NOUN nodes belong to 13 different parts of speech: NOUN (22956; 37% instances), PRON (10350; 17% instances), ADJ (7782; 13% instances), PART (6467; 11% instances), VERB (5182; 8% instances), ADV (2475; 4% instances), CCONJ (2325; 4% instances), NUM (1557; 3% instances), ADP (872; 1% instances), AUX (638; 1% instances), DET (526; 1% instances), INTJ (200; 0% instances), SCONJ (190; 0% instances)