
This page pertains to UD version 2.

Treebank Statistics: UD_Hebrew-HTB: POS Tags: DET

There are 21 DET lemmas (0%), 26 DET types (0%) and 17281 DET tokens (11%). Out of 15 observed tags, the rank of DET is: 11 in number of lemmas, 12 in number of types and 4 in number of tokens.
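Counts like these can be reproduced from the treebank's CoNLL-U files. The following is a minimal sketch, not the script that generated this page: the function name `upos_counts`, the helper `row`, and the toy English sentence are all invented for illustration. It assumes the standard 10-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, ...), skips multiword-token ranges and empty nodes, and compares forms exactly as written.

```python
def upos_counts(conllu_text, upos):
    """Return (n_lemmas, n_types, n_tokens) for one UPOS tag in CoNLL-U text."""
    lemmas, forms = set(), set()
    tokens = 0
    for line in conllu_text.splitlines():
        # Skip sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        tok_id, form, lemma, tag = cols[0], cols[1], cols[2], cols[3]
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "3.1").
        if "-" in tok_id or "." in tok_id:
            continue
        if tag == upos:
            lemmas.add(lemma)
            forms.add(form)
            tokens += 1
    return len(lemmas), len(forms), tokens


def row(tok_id, form, lemma, upos, head, deprel):
    """Build one CoNLL-U line; unused columns are filled with '_' (hypothetical helper)."""
    return "\t".join([tok_id, form, lemma, upos, "_", "_", head, deprel, "_", "_"])


# Invented toy data, not taken from UD_Hebrew-HTB.
SAMPLE = "\n".join([
    "# text = The dog saw the cats",
    row("1", "The", "the", "DET", "2", "det"),
    row("2", "dog", "dog", "NOUN", "3", "nsubj"),
    row("3", "saw", "see", "VERB", "0", "root"),
    row("4", "the", "the", "DET", "5", "det"),
    row("5", "cats", "cat", "NOUN", "3", "obj"),
    "",
])

print(upos_counts(SAMPLE, "DET"))  # one lemma, two distinct forms, two tokens
```

Whether the page's generator treats forms case-sensitively is not stated here, so the exact-match comparison above is an assumption.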

The 10 most frequent DET lemmas: ה, כול, כמה, רוב, _, הרבה, שום, מספר, אף, שאר

The 10 most frequent DET types: ה, ה_, כל, כמה, רוב, הרבה, שום, מספר, אף, שאר

The 10 most frequent ambiguous lemmas: ה (DET 16370, SCONJ 776, X 16), כול (DET 520, NOUN 31, ADV 1), רוב (DET 34, NOUN 15), _ (NOUN 365, VERB 326, ADJ 230, ADV 192, AUX 169, CCONJ 109, X 76, PRON 57, SCONJ 46, DET 33), הרבה (DET 33, ADV 21, VERB 13), שום (DET 33, NOUN 4, PROPN 1), מספר (NOUN 38, DET 31), אף (ADV 58, CCONJ 38, DET 20, NOUN 13), שאר (NOUN 20, DET 16), מרבית (DET 14, NOUN 2)

The 10 most frequent ambiguous types: ה (DET 13486, SCONJ 775, X 9), ה_ (DET 2910, X 7, SCONJ 1), רוב (DET 34, NOUN 10), הרבה (DET 33, ADV 21, VERB 3), שום (DET 33, NOUN 4, PROPN 1), מספר (DET 31, NOUN 30, VERB 6), אף (ADV 72, CCONJ 38, DET 20, NOUN 12), שאר (NOUN 19, DET 16), מרבית (DET 14, NOUN 1), מחצית (NOUN 22, DET 11, X 1)

Morphology

The form / lemma ratio of DET is 1.238095 (the average of all parts of speech is 1.702584).
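The reported ratio is consistent with dividing the number of DET types by the number of DET lemmas given at the top of this page (26 and 21). The short check below only reproduces that arithmetic; the variable names are illustrative.

```python
# Figures taken from the summary line of this page.
det_types = 26
det_lemmas = 21

ratio = det_types / det_lemmas
print(f"{ratio:.6f}")  # 1.238095
```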

The 1st highest number of forms (5) was observed with the lemma “”: אילו, ה, מחצית, מירב, מרבה.

The 2nd highest number of forms (2) was observed with the lemma “איזה”: איזה, איזו.

The 3rd highest number of forms (2) was observed with the lemma “ה”: ה, ה_.

DET occurs with 3 features: PronType (16396; 95% instances), Definite (885; 5% instances), Gender (18; 0% instances)

DET occurs with 3 feature-value pairs: Definite=Cons, Gender=Masc, PronType=Art

DET occurs with 3 feature combinations. The most frequent feature combination is PronType=Art (16396 tokens). Examples: ה, ה_

Relations

DET nodes are attached to their parents using 16 different relations: det (17005; 98% instances), dep (140; 1% instances), fixed (40; 0% instances), advmod (39; 0% instances), obl (10; 0% instances), advcl (9; 0% instances), compound:smixut (8; 0% instances), nsubj (7; 0% instances), obj (6; 0% instances), root (4; 0% instances), amod (3; 0% instances), nmod:poss (3; 0% instances), appos (2; 0% instances), conj (2; 0% instances), nsubj:cop (2; 0% instances), parataxis (1; 0% instances)

Parents of DET nodes belong to 12 different parts of speech: NOUN (13078; 76% instances), ADJ (3060; 18% instances), NUM (354; 2% instances), VERB (278; 2% instances), PRON (233; 1% instances), PROPN (180; 1% instances), ADV (59; 0% instances), ADP (17; 0% instances), X (12; 0% instances), DET (5; 0% instances), (4; 0% instances), AUX (1; 0% instances)

17202 (100%) DET nodes are leaves.

30 (0%) DET nodes have one child.

31 (0%) DET nodes have two children.

18 (0%) DET nodes have three or more children.

The highest child degree of a DET node is 20.

Children of DET nodes are attached using 17 different relations: dep (59; 30% instances), punct (45; 23% instances), fixed (40; 20% instances), advmod (15; 8% instances), case (8; 4% instances), flat:name (7; 4% instances), obl (5; 3% instances), case:gen (3; 2% instances), det (3; 2% instances), acl:relcl (2; 1% instances), case:acc (2; 1% instances), cc (2; 1% instances), amod (1; 1% instances), appos (1; 1% instances), compound:smixut (1; 1% instances), conj (1; 1% instances), nsubj (1; 1% instances)

Children of DET nodes belong to 12 different parts of speech: ADV (51; 26% instances), PUNCT (45; 23% instances), PROPN (21; 11% instances), ADP (17; 9% instances), VERB (15; 8% instances), NOUN (14; 7% instances), NUM (14; 7% instances), ADJ (6; 3% instances), DET (5; 3% instances), PRON (3; 2% instances), SCONJ (3; 2% instances), CCONJ (2; 1% instances)