
This page pertains to UD version 2.

Treebank Statistics: UD_Italian-VIT: POS Tags: NUM

There are 1301 NUM lemmas (7%), 1320 NUM types (5%) and 6393 NUM tokens (2%). Out of 17 observed tags, the rank of NUM is: 5 in number of lemmas, 5 in number of types and 12 in number of tokens.

The 10 most frequent NUM lemmas: due, tre, cento, 15, 1, 5, 1973, 2, 20, 30

The 10 most frequent NUM types: due, tre, cento, 15, 1, 5, 1973, 2, 20, 30

The 10 most frequent ambiguous lemmas: due (NUM 338, NOUN 2), cento (NUM 151, NOUN 1), 15 (NUM 117, NOUN 1), 1 (NUM 109, PROPN 3, NOUN 1), 5 (NUM 96, NOUN 1), 1973 (NUM 93, NOUN 1), 2 (NUM 93, NOUN 2), 20 (NUM 90, NOUN 1), 30 (NUM 85, ADJ 1), 3 (NUM 83, NOUN 2)

The 10 most frequent ambiguous types: due (NUM 328, NOUN 2), cento (NUM 151, NOUN 1), 15 (NUM 117, NOUN 1), 1 (NUM 109, PROPN 3, NOUN 1), 5 (NUM 96, NOUN 1), 1973 (NUM 93, NOUN 1), 2 (NUM 93, NOUN 2), 20 (NUM 90, NOUN 1), 30 (NUM 85, ADJ 1), 3 (NUM 83, NOUN 2)

Morphology

The form / lemma ratio of NUM is 1.014604 (the average of all parts of speech is 1.502411).
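The reported ratio is consistent with dividing the number of NUM types by the number of NUM lemmas given above. The short check below is a sketch under that assumption; the variable names are illustrative and not part of the statistics tooling.

```python
# Counts taken from the summary above (1320 NUM types, 1301 NUM lemmas).
num_types = 1320
num_lemmas = 1301

# Assumption: "form / lemma ratio" means distinct word forms per distinct lemma.
form_lemma_ratio = num_types / num_lemmas

print(f"{form_lemma_ratio:.6f}")  # 1.014604
```

A ratio this close to 1 means that almost every NUM lemma appears in a single form, which fits the short list of multi-form lemmas that follows.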

The highest number of forms (4) was observed with the lemma “uno”: un, un’, una, uno.

The second highest number of forms (3) was observed with the lemma “mezzo”: mezz’, mezza, mezzo.

The third highest number of forms (also 3) was observed with the lemma “terzo”: terza, terzi, terzo.

NUM occurs with 5 features: NumType (6387; 100% instances), Gender (98; 2% instances), Number (97; 2% instances), Definite (1; 0% instances), PronType (1; 0% instances)

NUM occurs with 8 feature-value pairs: Definite=Ind, Gender=Fem, Gender=Masc, NumType=Card, NumType=Range, Number=Plur, Number=Sing, PronType=Art

NUM occurs with 8 feature combinations. The most frequent feature combination is NumType=Card (6259 tokens). Examples: due, tre, cento, 15, 1, 5, 1973, 2, 20, 30

Relations

NUM nodes are attached to their parents using 19 different relations: nummod (4594; 72% instances), nmod (499; 8% instances), obl (430; 7% instances), flat (320; 5% instances), conj (269; 4% instances), appos (75; 1% instances), obj (69; 1% instances), nsubj (50; 1% instances), root (30; 0% instances), flat:name (17; 0% instances), compound (11; 0% instances), nsubj:pass (11; 0% instances), parataxis (9; 0% instances), ccomp (3; 0% instances), acl:relcl (2; 0% instances), advcl (1; 0% instances), dislocated (1; 0% instances), obl:agent (1; 0% instances), xcomp (1; 0% instances)
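Counts of this kind can be reproduced by tallying the DEPREL column for every token whose UPOS is NUM in a CoNLL-U file. The following is a minimal sketch, not the script that generated this page: the function name `count_relations` and the two-sentence sample (including its XPOS values) are hypothetical and are not taken from UD_Italian-VIT.

```python
from collections import Counter

# Hypothetical CoNLL-U fragment for illustration only (tab-separated, 10 columns).
SAMPLE = "\n".join([
    "# text = due libri",
    "1\tdue\tdue\tNUM\tN\tNumType=Card\t2\tnummod\t_\t_",
    "2\tlibri\tlibro\tNOUN\tS\tGender=Masc|Number=Plur\t0\troot\t_\t_",
    "",
    "# text = nel 1973",
    "1-2\tnel\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tin\tin\tADP\tE\t_\t3\tcase\t_\t_",
    "2\til\til\tDET\tRD\tDefinite=Def|PronType=Art\t3\tdet\t_\t_",
    "3\t1973\t1973\tNUM\tN\tNumType=Card\t0\troot\t_\t_",
    "",
])


def count_relations(conllu_text, upos="NUM"):
    """Count the DEPREL values of all syntactic words with the given UPOS tag."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        if cols[3] == upos:      # column 4: UPOS
            counts[cols[7]] += 1  # column 8: DEPREL
    return counts


relation_counts = count_relations(SAMPLE)
for relation, count in relation_counts.most_common():
    print(relation, count)
```

Run on the full treebank files, a tally like this should give one count per relation; the percentages shown above would then be each count divided by the total number of NUM tokens.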

Parents of NUM nodes belong to 12 different parts of speech: NOUN (4118; 64% instances), VERB (713; 11% instances), NUM (639; 10% instances), SYM (521; 8% instances), PROPN (225; 4% instances), ADJ (87; 1% instances), no parent, i.e. the NUM node is the root (30; 0% instances), PRON (28; 0% instances), ADV (19; 0% instances), X (11; 0% instances), ADP (1; 0% instances), AUX (1; 0% instances)

4047 (63%) NUM nodes are leaves.

870 (14%) NUM nodes have one child.

856 (13%) NUM nodes have two children.

620 (10%) NUM nodes have three or more children.

The highest child degree of a NUM node is 27.
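The four child-degree buckets above partition the 6393 NUM tokens, and the percentages are each bucket's share of that total, rounded to whole numbers. The sketch below checks that arithmetic; the dictionary keys are illustrative labels, not names used by the statistics tooling.

```python
# Bucket sizes as reported above.
child_degree_buckets = {
    "leaf": 4047,
    "one child": 870,
    "two children": 856,
    "three or more children": 620,
}

total_num_tokens = sum(child_degree_buckets.values())

# Share of each bucket, rounded to a whole percentage.
percentages = {
    label: round(100 * count / total_num_tokens)
    for label, count in child_degree_buckets.items()
}

print(total_num_tokens)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

The total matches the 6393 NUM tokens reported at the top of the page, so no NUM node is missing from the breakdown.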

Children of NUM nodes are attached using 26 different relations: case (1100; 23% instances), det (1035; 21% instances), flat (734; 15% instances), punct (567; 12% instances), nmod (396; 8% instances), conj (225; 5% instances), advmod (209; 4% instances), cc (173; 4% instances), nummod (160; 3% instances), amod (74; 2% instances), cop (44; 1% instances), nsubj (42; 1% instances), appos (33; 1% instances), obl (17; 0% instances), acl:relcl (14; 0% instances), advcl (11; 0% instances), aux (6; 0% instances), ccomp (5; 0% instances), mark (5; 0% instances), parataxis (5; 0% instances), compound (3; 0% instances), obj (2; 0% instances), acl (1; 0% instances), det:poss (1; 0% instances), discourse (1; 0% instances), flat:name (1; 0% instances)

Children of NUM nodes belong to 16 different parts of speech: ADP (1070; 22% instances), DET (1036; 21% instances), NOUN (862; 18% instances), NUM (639; 13% instances), PUNCT (567; 12% instances), ADV (246; 5% instances), CCONJ (174; 4% instances), ADJ (87; 2% instances), AUX (50; 1% instances), PROPN (47; 1% instances), VERB (36; 1% instances), PRON (26; 1% instances), SYM (14; 0% instances), SCONJ (5; 0% instances), X (4; 0% instances), INTJ (1; 0% instances)