
This page pertains to UD version 2.

Treebank Statistics: UD_Icelandic-GC: POS Tags: VERB

There are 1229 VERB lemmas (8%), 2867 VERB types (13%) and 12840 VERB tokens (13%). Out of 17 observed tags, the rank of VERB is: 4 in number of lemmas, 3 in number of types and 2 in number of tokens.

The 10 most frequent VERB lemmas: vera, segja, koma, verða, fara, hafa, gera, eiga, taka, fá

The 10 most frequent VERB types: er, segir, var, eru, sagði, hafa, kemur, gera, verið, koma

The 10 most frequent ambiguous lemmas: vera (AUX 2480, VERB 1353, X 5, ADV 2, NOUN 1), segja (VERB 750, ADV 7), koma (VERB 521, NOUN 3, ADJ 1), verða (VERB 379, AUX 75), fara (VERB 338, NOUN 2, ADV 1), hafa (AUX 1150, VERB 333, ADV 1, NOUN 1), gera (VERB 309, ADJ 1), eiga (VERB 298, ADP 181, ADV 13, NOUN 12, PROPN 2, ADJ 1), taka (VERB 298, NOUN 3), fá (VERB 221, AUX 5, NOUN 1)

The 10 most frequent ambiguous types: er (AUX 933, VERB 546, SCONJ 30, ADV 6, X 3), segir (VERB 451, ADV 2), var (AUX 498, VERB 187, X 1), eru (AUX 211, VERB 165), sagði (VERB 145, ADV 1), hafa (AUX 263, VERB 148), verið (AUX 266, VERB 108, X 3, NOUN 1), verður (VERB 101, AUX 26), (AUX 168, VERB 98), á (ADP 2211, ADV 238, VERB 85, SCONJ 12, NOUN 6, X 3, PROPN 2, PART 1)

Morphology

The form / lemma ratio of VERB is 2.332791 (the average of all parts of speech is 1.434754).
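The reported ratio agrees with dividing the number of VERB types by the number of VERB lemmas given above (2867 / 1229). As a minimal sketch, assuming the treebank is available as CoNLL-U text and that forms are compared case-sensitively (an assumption about the counting rules, not something this page states), the ratio could be computed as follows. The function name `form_lemma_ratio` and the toy rows are hypothetical and serve only as illustration.

```python
def form_lemma_ratio(conllu_text, upos="VERB"):
    """Return (#distinct forms) / (#distinct lemmas) for one UPOS tag."""
    forms, lemmas = set(), set()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Keep only regular token lines: 10 columns and an integer ID.
        # This skips comments, blank lines, multiword ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        if cols[3] == upos:
            forms.add(cols[1])   # FORM column (case-sensitive, an assumption)
            lemmas.add(cols[2])  # LEMMA column
    return len(forms) / len(lemmas) if lemmas else 0.0


# Invented toy rows (ID, FORM, LEMMA, UPOS); remaining columns are padded with "_".
TOY_ROWS = [
    ("1", "er", "vera", "VERB"),
    ("2", "var", "vera", "VERB"),
    ("3", "segir", "segja", "VERB"),
    ("4", "hún", "hún", "PRON"),
]
SAMPLE = "\n".join("\t".join(row + ("_",) * 6) for row in TOY_ROWS)

print(form_lemma_ratio(SAMPLE))   # 3 forms / 2 lemmas
print(round(2867 / 1229, 6))      # types / lemmas as reported on this page
```

On the toy rows the ratio is 1.5, since three distinct forms map to two lemmas.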

The highest number of forms (25) was observed with the lemma “taka”: Tækist, taka, takast, taki, takist, takið, taktu, tek, tekin, tekinn, tekist, tekið, teknar, teknir, tekst, tekur, tæki, tók, tóks, tókst, tóku, tókum, tókust, tökum, tökumst.

The 2nd highest number of forms (22) was observed with the lemma “koma”: kemst, kemur, kom, koma, komandi, komast, komi, komin, kominn, komist, komið, komnar, komnir, komst, komu, komum, komumst, komust, kæmi, kæmist, kæmu, kæmust.

The 3rd highest number of forms (19) was observed with the lemma “vera”: er, ert, ertu, eru, erum, eruð, sé, séu, séum, var, varst, vera, verið, voru, vorum, vægi, væri, væru, værum.

VERB occurs with 8 features: Voice (11474; 89% instances), Number (8606; 67% instances), Mood (7402; 58% instances), Tense (7345; 57% instances), Person (7287; 57% instances), Case (5565; 43% instances), VerbForm (5302; 41% instances), Gender (91; 1% instances)

VERB occurs with 22 feature-value pairs: Case=Acc, Case=Dat, Case=Gen, Case=Nom, Gender=Fem, Gender=Masc, Gender=Neut, Mood=Imp, Mood=Ind, Mood=Sub, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Tense=Past, Tense=Pres, VerbForm=Inf, VerbForm=Part, VerbForm=Sup, Voice=Act, Voice=Mid

VERB occurs with 209 feature combinations. The most frequent feature combination is Mood=Ind|Number=Sing|Person=3|Tense=Pres|Voice=Act (1395 tokens). Examples: er, segir, kemur, verður, á, fer, þarf, hefur, bendir, tekur
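Feature statistics of this kind can be approximated by tallying the FEATS column of every VERB token. The sketch below assumes CoNLL-U input and assumes that tokens without features (`_`) are not counted as a combination; the actual statistics script may treat them differently. The name `feature_stats` and the toy annotation are hypothetical.

```python
from collections import Counter


def feature_stats(conllu_text, upos="VERB"):
    """Count feature names, feature=value pairs and full combinations."""
    names, pairs, combos = Counter(), Counter(), Counter()
    total = 0
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit() or cols[3] != upos:
            continue
        total += 1
        feats = cols[5]
        if feats == "_":          # token carries no morphological features
            continue
        combos[feats] += 1        # whole FEATS string is one combination
        for pair in feats.split("|"):
            pairs[pair] += 1
            names[pair.split("=", 1)[0]] += 1
    return total, names, pairs, combos


# Invented toy rows: (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC)
TOY_ROWS = [
    ("1", "er", "vera", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Pres|Voice=Act", "0", "root", "_", "_"),
    ("2", "koma", "koma", "VERB", "_", "VerbForm=Inf|Voice=Act", "1", "xcomp", "_", "_"),
    ("3", "á", "á", "ADP", "_", "_", "2", "case", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in TOY_ROWS)

total, names, pairs, combos = feature_stats(SAMPLE)
for name, count in names.most_common():
    print(f"{name} ({count}; {round(100 * count / total)}% instances)")
```

The percentage printed for each feature is its count divided by the number of VERB tokens, which is how the figures above relate to the 12840 VERB tokens.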

Relations

VERB nodes are attached to their parents using 17 different relations: root (3975; 31% instances), xcomp (2180; 17% instances), conj (1942; 15% instances), acl:relcl (1334; 10% instances), advcl (952; 7% instances), ccomp (898; 7% instances), obl (850; 7% instances), obj (340; 3% instances), dep (185; 1% instances), nsubj (153; 1% instances), iobj (15; 0% instances), amod (5; 0% instances), case (4; 0% instances), csubj (3; 0% instances), acl (2; 0% instances), nmod:poss (1; 0% instances), parataxis (1; 0% instances)

Parents of VERB nodes belong to 14 different parts of speech: VERB (5349; 42% instances), no parent, i.e. root (3975; 31% instances), NOUN (1561; 12% instances), PRON (679; 5% instances), ADJ (628; 5% instances), ADV (296; 2% instances), PROPN (138; 1% instances), SCONJ (131; 1% instances), NUM (34; 0% instances), ADP (18; 0% instances), CCONJ (15; 0% instances), PART (10; 0% instances), X (5; 0% instances), AUX (1; 0% instances)
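The relation and parent counts can be sketched by looking up each VERB token's HEAD within its sentence. This is a minimal illustration assuming CoNLL-U input; tokens whose HEAD is 0 attach to the artificial root and therefore have no parent part of speech, which the sketch records as an empty string. The helper names `sentences` and `attachment_stats` and the toy sentence are hypothetical.

```python
from collections import Counter


def sentences(conllu_text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in conllu_text.splitlines() + [""]:
        if not line.strip():          # a blank line closes the sentence
            if rows:
                yield rows
            rows = []
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            rows.append(cols)


def attachment_stats(conllu_text, upos="VERB"):
    """Count incoming relations and parent UPOS tags for one UPOS tag."""
    relations, parent_pos = Counter(), Counter()
    for rows in sentences(conllu_text):
        pos_by_id = {r[0]: r[3] for r in rows}
        for r in rows:
            if r[3] != upos:
                continue
            relations[r[7]] += 1                      # DEPREL column
            parent_pos[pos_by_id.get(r[6], "")] += 1  # HEAD 0 maps to ""
    return relations, parent_pos


# Invented toy sentence: "Hún segir að hann komi ."
TOY_ROWS = [
    ("1", "Hún", "hún", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "segir", "segja", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "að", "að", "SCONJ", "_", "_", "5", "mark", "_", "_"),
    ("4", "hann", "hann", "PRON", "_", "_", "5", "nsubj", "_", "_"),
    ("5", "komi", "koma", "VERB", "_", "_", "2", "ccomp", "_", "_"),
    ("6", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in TOY_ROWS) + "\n"

relations, parent_pos = attachment_stats(SAMPLE)
print(relations)
print(parent_pos)
```

In the toy sentence one VERB is the root (no parent tag) and the other is a ccomp of a VERB, mirroring how the root count and the parentless count coincide in the figures above (3975 in both lists).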

689 (5%) VERB nodes are leaves.

807 (6%) VERB nodes have one child.

2389 (19%) VERB nodes have two children.

8955 (70%) VERB nodes have three or more children.

The highest child degree of a VERB node is 15.
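The four child-count figures above sum to the 12840 VERB tokens (689 + 807 + 2389 + 8955). A distribution of this shape could be derived by counting, per sentence, how often each token ID occurs in the HEAD column. The sketch below assumes CoNLL-U input and groups degrees of three or more into one bucket, as the page does; `child_degree_histogram` and the toy sentence are hypothetical.

```python
from collections import Counter


def child_degree_histogram(conllu_text, upos="VERB"):
    """Return (histogram with degrees capped at 3, maximum degree seen)."""
    hist, max_degree = Counter(), 0
    rows = []
    for line in conllu_text.splitlines() + [""]:
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            rows.append(cols)
        elif not line.strip() and rows:
            # End of sentence: count how many tokens point at each head ID.
            children = Counter(r[6] for r in rows)
            for r in rows:
                if r[3] == upos:
                    degree = children[r[0]]        # 0 for leaves
                    hist[min(degree, 3)] += 1      # bucket 3 means "3 or more"
                    max_degree = max(max_degree, degree)
            rows = []
    return hist, max_degree


# Invented toy sentence: "Hún segir að hann komi ."
TOY_ROWS = [
    ("1", "Hún", "hún", "PRON", "_", "_", "2", "nsubj", "_", "_"),
    ("2", "segir", "segja", "VERB", "_", "_", "0", "root", "_", "_"),
    ("3", "að", "að", "SCONJ", "_", "_", "5", "mark", "_", "_"),
    ("4", "hann", "hann", "PRON", "_", "_", "5", "nsubj", "_", "_"),
    ("5", "komi", "koma", "VERB", "_", "_", "2", "ccomp", "_", "_"),
    ("6", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
]
SAMPLE = "\n".join("\t".join(row) for row in TOY_ROWS) + "\n"

hist, max_degree = child_degree_histogram(SAMPLE)
print(dict(hist), max_degree)
```

Here "segir" has three children and "komi" has two, so the toy histogram has one entry in the two-children bucket and one in the three-or-more bucket.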

Children of VERB nodes are attached using 26 different relations: nsubj (7110; 16% instances), obl (7042; 16% instances), advmod (5436; 12% instances), mark (5046; 11% instances), obj (4342; 10% instances), punct (4203; 9% instances), cc (1928; 4% instances), conj (1892; 4% instances), xcomp (1734; 4% instances), aux (1549; 3% instances), cop (1018; 2% instances), advcl (763; 2% instances), case (713; 2% instances), ccomp (484; 1% instances), compound:prt (406; 1% instances), iobj (381; 1% instances), acl:relcl (266; 1% instances), dep (200; 0% instances), nmod (184; 0% instances), expl (121; 0% instances), amod (14; 0% instances), nummod (4; 0% instances), parataxis (4; 0% instances), nmod:poss (3; 0% instances), csubj (2; 0% instances), flat:foreign (1; 0% instances)

Children of VERB nodes belong to 17 different parts of speech: NOUN (11506; 26% instances), ADV (5998; 13% instances), VERB (5349; 12% instances), PRON (4252; 9% instances), PUNCT (4203; 9% instances), SCONJ (3042; 7% instances), AUX (2570; 6% instances), PROPN (2062; 5% instances), CCONJ (2049; 5% instances), PART (1990; 4% instances), ADJ (839; 2% instances), ADP (747; 2% instances), NUM (189; 0% instances), X (40; 0% instances), INTJ (4; 0% instances), SYM (4; 0% instances), DET (2; 0% instances)