
This page pertains to UD version 2.

Treebank Statistics: UD_Icelandic-IcePaHC: POS Tags: NUM

There are 445 NUM lemmas (1%), 543 NUM types (1%) and 4412 NUM tokens (0%). Out of 16 observed tags, the rank of NUM is: 7 in number of lemmas, 8 in number of types and 14 in number of tokens.
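The three counts above can be reproduced in principle from a treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `pos_counts` is hypothetical, the sample rows are invented for illustration, and counting types case-sensitively is an assumption.

```python
def pos_counts(conllu_text, upos="NUM"):
    """Return (lemma count, type count, token count) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if cols[3] == upos:
            tokens += 1
            types.add(cols[1])   # FORM column (case-sensitive: an assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(lemmas), len(types), tokens


# Invented sample rows (ID, FORM, LEMMA, UPOS); remaining columns are "_".
rows = [
    ("1", "tveir", "tveir", "NUM"),
    ("2", "tvo", "tveir", "NUM"),
    ("3", "þrjá", "þrír", "NUM"),
    ("4", "daga", "dagur", "NOUN"),
]
sample = "\n".join("\t".join(r + ("_",) * 6) for r in rows)
print(pos_counts(sample))  # (2, 3, 3)
```

In the sample, two lemmas (tveir, þrír) account for three distinct word forms and three tokens.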

The 10 most frequent NUM lemmas: tveir, þrír, fjórir, fimm, tólf, sex, sjö, tíu, hálfur, hundrað

The 10 most frequent NUM types: tveir, tólf, tvo, fimm, tvö, sex, þrír, þrjú, sjö, þrjá

The 10 most frequent ambiguous lemmas: tveir (NUM 811, ADV 16, NOUN 4, ADJ 2, PROPN 1), þrír (NUM 503, ADV 7, NOUN 5, ADJ 3), fjórir (NUM 266, ADJ 4, ADV 3, NOUN 1), fimm (NUM 203, ADV 3, ADJ 1, NOUN 1), tólf (NUM 194, ADV 2, NOUN 2, X 1), sex (NUM 175, NOUN 4, X 2, ADV 1), sjö (NUM 120, ADJ 2, ADV 1, NOUN 1), tíu (NUM 115, ADJ 3, NOUN 3, ADV 1), hálfur (NUM 106, ADJ 62, DET 1), hundrað (NOUN 122, NUM 91)

The 10 most frequent ambiguous types: tveir (NUM 148, ADV 10), tólf (NUM 156, ADV 2), fimm (NUM 148, ADV 3), tvö (NUM 132, ADV 5), sex (NUM 117, ADV 1, NOUN 1), þrír (NUM 98, ADV 5), sjö (NUM 88, ADV 1), tvær (NUM 78, ADV 1), tíu (NUM 79, ADJ 1, ADV 1), átta (NUM 58, VERB 14, ADJ 13, ADV 3)

Morphology

The form / lemma ratio of NUM is 1.220225 (the average of all parts of speech is 1.856953).
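The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given above. A quick check, assuming that this is how the figure is derived:

```python
num_types = 543   # NUM types reported on this page
num_lemmas = 445  # NUM lemmas reported on this page

ratio = num_types / num_lemmas
print(f"{ratio:.6f}")  # 1.220225
```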

The 1st highest number of forms (20) was observed with the lemma “hvortveggja”: hverirtveggju, hvorirtveggi, hvorirtveggja, hvorirtveggju, hvorntveggja, hvorratveggja, hvorritveggju, hvorrtveggi, hvorrtveggja, hvorstveggja, hvorttveggja, hvortveggi, hvortveggja, hvorumtveggja, hvorumtveggju, hvorumtveggjum, hvorutveggi, hvorutveggja, hvorutveggju, hvorutveggjum.

The 2nd highest number of forms (15) was observed with the lemma “þrír”: 3, 3., iii, iiij, iij, þrem, þremur, þriggja, þrim, þrimur, þrjá, þrjár, þrjú, þrí, þrír.

The 3rd highest number of forms (13) was observed with the lemma “tveir”: 2, 2., ii, ij, ijur, tveggja, tveim, tveimur, tveir, tvo, tvær, tví-, tvö.

NUM occurs with 13 features: NumType (3438; 78% instances), Case (2748; 62% instances), Number (2738; 62% instances), Gender (2726; 62% instances), Definite (414; 9% instances), Degree (214; 5% instances), Foreign (51; 1% instances), PronType (44; 1% instances), Mood (5; 0% instances), Person (5; 0% instances), Tense (5; 0% instances), VerbForm (5; 0% instances), Voice (5; 0% instances)

NUM occurs with 28 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Definite=Def, Definite=Ind, Degree=Cmp, Degree=Pos, Degree=Sup, Foreign=Yes, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Ind, Mood=Sub, NumType=Card, NumType=Frac, NumType=Ord, Number=Plur, Number=Sing, Person=3, PronType=Dem, PronType=Ind, PronType=Int, PronType=Prs, Tense=Pres, VerbForm=Fin, Voice=Act

NUM occurs with 104 feature combinations. The most frequent feature combination is NumType=Card (1099 tokens). Examples: tólf, tveir, tvo, fimm, sex, tvö, 3, þrír, 2, sjö

Relations

NUM nodes are attached to their parents using 17 different relations: nummod (3237; 73% instances), obl (378; 9% instances), root (197; 4% instances), nsubj (129; 3% instances), conj (106; 2% instances), obj (88; 2% instances), dep (86; 2% instances), advcl (49; 1% instances), appos (38; 1% instances), xcomp (34; 1% instances), amod (31; 1% instances), ccomp (15; 0% instances), acl:relcl (8; 0% instances), nmod:poss (8; 0% instances), iobj (6; 0% instances), acl (1; 0% instances), nmod (1; 0% instances)
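The percentages in this list are consistent with each relation's count divided by the total number of NUM tokens, rounded to the nearest whole percent. The sketch below recomputes them from the counts on this page; the rounding rule is an assumption.

```python
relation_counts = {
    "nummod": 3237, "obl": 378, "root": 197, "nsubj": 129, "conj": 106,
    "obj": 88, "dep": 86, "advcl": 49, "appos": 38, "xcomp": 34,
    "amod": 31, "ccomp": 15, "acl:relcl": 8, "nmod:poss": 8, "iobj": 6,
    "acl": 1, "nmod": 1,
}

# Every NUM token has exactly one incoming relation, so the counts
# should add up to the NUM token total.
total = sum(relation_counts.values())

# Whole-number percentage share of each relation (rounding is assumed).
shares = {rel: round(100 * n / total) for rel, n in relation_counts.items()}

print(total)             # 4412
print(shares["nummod"])  # 73
print(shares["obl"])     # 9
```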

Parents of NUM nodes belong to 14 different parts of speech: NOUN (2701; 61% instances), VERB (575; 13% instances), PROPN (242; 5% instances), NUM (238; 5% instances), [none: root node] (197; 4% instances), X (158; 4% instances), PRON (87; 2% instances), DET (78; 2% instances), ADJ (58; 1% instances), ADV (53; 1% instances), AUX (16; 0% instances), CCONJ (4; 0% instances), ADP (3; 0% instances), PART (2; 0% instances)

3104 (70%) NUM nodes are leaves.

783 (18%) NUM nodes have one child.

327 (7%) NUM nodes have two children.

198 (4%) NUM nodes have three or more children.

The highest child degree of a NUM node is 14.
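The child-degree figures above (leaves, one child, two children, three or more) can be derived from the HEAD column of each sentence. The following is a minimal sketch for a single sentence; the function name `child_degree_histogram` and the toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_histogram(heads, is_target):
    """Bucket target nodes by number of children: 0, 1, 2, or 3 (= three or more).

    heads[i] is the 1-based head index of token i+1 (0 marks the root);
    is_target[i] says whether token i+1 has the tag of interest (e.g. NUM).
    """
    # Number of children per node: how often each index appears as a head.
    degree = Counter(h for h in heads if h != 0)
    hist = Counter()
    for idx, flag in enumerate(is_target, start=1):
        if flag:
            hist[min(degree.get(idx, 0), 3)] += 1
    return hist


# Toy sentence: token 1 is the root with children 2, 3 and 5; token 3 heads token 4.
heads = [0, 1, 1, 3, 1]
is_target = [True, False, True, False, False]
hist = child_degree_histogram(heads, is_target)
print(hist[3], hist[1], hist[0])  # 1 1 0
```

Summing such histograms over all sentences would give counts comparable to those listed here; the four reported counts (3104, 783, 327, 198) add up to the 4412 NUM tokens.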

Children of NUM nodes are attached using 25 different relations: punct (544; 25% instances), conj (347; 16% instances), cc (212; 10% instances), obl (200; 9% instances), nummod (198; 9% instances), case (120; 5% instances), det (81; 4% instances), amod (77; 4% instances), advmod (72; 3% instances), mark (70; 3% instances), cop (66; 3% instances), nmod:poss (57; 3% instances), nsubj (55; 3% instances), acl:relcl (34; 2% instances), nmod (17; 1% instances), dep (12; 1% instances), advcl (8; 0% instances), appos (8; 0% instances), compound:prt (5; 0% instances), xcomp (5; 0% instances), flat:foreign (3; 0% instances), acl (2; 0% instances), ccomp (2; 0% instances), aux (1; 0% instances), parataxis (1; 0% instances)

Children of NUM nodes belong to 15 different parts of speech: PUNCT (544; 25% instances), NOUN (369; 17% instances), NUM (238; 11% instances), CCONJ (213; 10% instances), VERB (168; 8% instances), ADP (125; 6% instances), PRON (121; 6% instances), ADV (82; 4% instances), DET (80; 4% instances), AUX (70; 3% instances), SCONJ (69; 3% instances), ADJ (56; 3% instances), PROPN (51; 2% instances), X (10; 0% instances), PART (1; 0% instances)