Treebank Statistics: UD_Arabic-NYUAD: POS Tags: X
There are 36 X lemmas (1%), 1 X types (6%) and 927 X tokens (0%).
Out of 16 observed tags, the rank of X is: 4 in number of lemmas, 16 in number of types and 15 in number of tokens.
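Counts and ranks of this kind can be reproduced from (form, lemma, tag) triples. The following is a minimal sketch, not the script that generated this page: the toy data, the function names `tag_statistics` and `rank_of`, and the choice to treat forms as case-sensitive are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (form, lemma, upos) triples, not taken from the treebank.
TOKENS = [
    ("cats", "cat", "NOUN"),
    ("cat", "cat", "NOUN"),
    ("dog", "dog", "NOUN"),
    ("runs", "run", "VERB"),
    ("ran", "run", "VERB"),
    ("xx", "typo", "X"),
]

def tag_statistics(tokens):
    """Return {upos: (n_lemmas, n_types, n_tokens)}."""
    lemmas = defaultdict(set)
    types = defaultdict(set)
    counts = defaultdict(int)
    for form, lemma, upos in tokens:
        lemmas[upos].add(lemma)   # distinct lemmas seen with this tag
        types[upos].add(form)     # distinct word forms seen with this tag
        counts[upos] += 1         # every occurrence counts as one token
    return {t: (len(lemmas[t]), len(types[t]), counts[t]) for t in counts}

def rank_of(tag, stats, column):
    """1-based rank of `tag` when tags are sorted by `column`, largest first."""
    ordered = sorted(stats, key=lambda t: stats[t][column], reverse=True)
    return ordered.index(tag) + 1

stats = tag_statistics(TOKENS)
print(stats["X"], rank_of("X", stats, 2))
```

Here column 0 is the lemma count, column 1 the type count and column 2 the token count, so a tag can rank differently on each, as X does above.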
The 10 most frequent X lemmas: _، typo، TBupdate، the، EADS، b، in، w، &Cx0b، &QC
The 10 most frequent X types: _
The 10 most frequent ambiguous lemmas: _ (NOUN 221327, PUNCT 71973, ADJ 68841, ADP 62617, VERB 55127, PROPN 48391, ADV 23955, SCONJ 15652, NUM 15105, PRON 12926, AUX 6881, DET 6354, CCONJ 3889, PART 1501, X 380, INTJ 56), typo (X 317, ADP 1), TBupdate (NOUN 408, ADJ 340, VERB 268, X 190, PROPN 69, PUNCT 15, ADP 1, SCONJ 1), b (ADP 12334, NOUN 21, DET 2, PRON 2, SCONJ 2, X 2, ADJ 1, VERB 1), w (CCONJ 43819, SCONJ 235, ADP 42, NOUN 41, VERB 40, ADJ 14, PRON 12, PROPN 9, DET 4, PART 3, NUM 2, PUNCT 2, X 2), F (NOUN 2, X 1), l (ADP 15628, PART 165, NOUN 29, SCONJ 28, ADV 2, VERB 2, ADJ 1, DET 1, NUM 1, PROPN 1, PUNCT 1, X 1), s (AUX 2274, VERB 7, NOUN 1, X 1)
The 10 most frequent ambiguous types: _ (NOUN 221899, ADP 91743, PUNCT 75266, ADJ 69355, PROPN 57421, VERB 55469, CCONJ 49161, PRON 43495, ADV 24067, SCONJ 16614, NUM 15377, AUX 9155, DET 6363, PART 2521, X 927, INTJ 56)
- _
- NOUN 221899: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- ADP 91743: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- PUNCT 75266: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- ADJ 69355: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- PROPN 57421: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- VERB 55469: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- CCONJ 49161: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- PRON 43495: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- ADV 24067: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- SCONJ 16614: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- NUM 15377: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- AUX 9155: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- DET 6363: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- PART 2521: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- X 927: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
- INTJ 56: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Morphology
The form / lemma ratio of X is 0.027778 (the average of all parts of speech is 0.003044).
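The reported ratio is consistent with dividing the number of X types (1) by the number of X lemmas (36). A small check, assuming that definition; the function name `form_lemma_ratio` is hypothetical:

```python
def form_lemma_ratio(n_types: int, n_lemmas: int) -> float:
    """Number of distinct forms divided by number of distinct lemmas (assumed definition)."""
    return n_types / n_lemmas

ratio = form_lemma_ratio(1, 36)
print(f"{ratio:.6f}")  # prints 0.027778
```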
The 1st highest number of forms (1) was observed with the lemma “&Cx0b”: _.
The 2nd highest number of forms (1) was observed with the lemma “&QC”: _.
The 3rd highest number of forms (1) was observed with the lemma “&UR”: _.
X occurs with 9 features: Gender (482; 52% instances), Number (482; 52% instances), Definite (281; 30% instances), Person (205; 22% instances), Voice (205; 22% instances), Mood (197; 21% instances), Case (38; 4% instances), AdpType (2; 0% instances), Polarity (2; 0% instances)
X occurs with 21 feature-value pairs: AdpType=Prep, Case=Acc, Case=Gen, Case=Nom, Definite=Com, Definite=Def, Definite=Ind, Gender=Fem, Gender=Masc, Mood=Ind, Mood=Jus, Mood=Sub, Number=Dual, Number=Plur, Number=Sing, Person=1, Person=2, Person=3, Polarity=Neg, Voice=Act, Voice=Pass
X occurs with 47 feature combinations.
The most frequent feature combination is _ (439 tokens).
Examples: _
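The three feature counts above (features, feature-value pairs, and whole combinations) can be derived from the FEATS column of a CoNLL-U file. The sketch below assumes the standard `Name=Value|Name=Value` encoding with `_` for an empty feature set; the sample values and the names `parse_feats` and `feature_summary` are hypothetical.

```python
from collections import Counter

def parse_feats(feats: str) -> dict:
    """Split a CoNLL-U FEATS string into a {feature: value} dict."""
    if feats == "_" or not feats:
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))

def feature_summary(feats_column):
    """Count features, feature-value pairs and full combinations."""
    feature_counts = Counter()
    pair_counts = Counter()
    combo_counts = Counter()
    for feats in feats_column:
        parsed = parse_feats(feats)
        feature_counts.update(parsed.keys())
        pair_counts.update(f"{k}={v}" for k, v in parsed.items())
        # An empty feature set is recorded as the combination "_".
        combo_counts[feats if parsed else "_"] += 1
    return feature_counts, pair_counts, combo_counts

# Hypothetical FEATS values for five tokens of one tag.
FEATS = ["_", "_", "Gender=Masc|Number=Sing", "Case=Gen|Gender=Fem|Number=Sing", "_"]
features, pairs, combos = feature_summary(FEATS)
print(features["Gender"], pairs["Number=Sing"], combos.most_common(1))
```

With this representation, a most frequent combination of `_` simply means that the largest single group of tokens carries no features at all.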
Relations
X nodes are attached to their parents using 9 different relations: nmod (653; 70% instances), obj (106; 11% instances), nmod:poss (93; 10% instances), iobj (22; 2% instances), nsubj (21; 2% instances), mark (19; 2% instances), root (11; 1% instances), acl (1; 0% instances), xcomp (1; 0% instances)
Parents of X nodes belong to 11 different parts of speech: NOUN (347; 37% instances), VERB (346; 37% instances), ADV (50; 5% instances), ADJ (49; 5% instances), PROPN (35; 4% instances), PRON (33; 4% instances), X (30; 3% instances), NUM (15; 2% instances), (11; 1% instances), DET (7; 1% instances), CCONJ (4; 0% instances)
607 (65%) X nodes are leaves.
156 (17%) X nodes have one child.
100 (11%) X nodes have two children.
64 (7%) X nodes have three or more children.
The highest child degree of an X node is 19.
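The leaf and child-degree figures can be obtained by counting, for every node with a given tag, how many other nodes name it as their head. A minimal sketch for a single sentence, assuming 1-based CoNLL-U head indices with 0 for the root; the toy sentence and the function name `degree_buckets` are hypothetical.

```python
from collections import Counter

def degree_buckets(heads, upos, tag):
    """Bucket nodes tagged `tag` by number of children.

    heads[i] is the 1-based head index of token i+1 (0 means root).
    Returns (Counter of bucket -> node count, highest child degree).
    """
    children = Counter(h for h in heads if h != 0)
    buckets = Counter()
    max_degree = 0
    for idx, pos in enumerate(upos, start=1):
        if pos != tag:
            continue
        degree = children[idx]  # Counter returns 0 for nodes with no children
        max_degree = max(max_degree, degree)
        key = ("leaf", "one", "two")[degree] if degree < 3 else "three_or_more"
        buckets[key] += 1
    return buckets, max_degree

# Hypothetical five-token sentence: token 1 is the root with three children.
UPOS = ["X", "NOUN", "X", "X", "PUNCT"]
HEADS = [0, 1, 1, 3, 1]
buckets, max_degree = degree_buckets(HEADS, UPOS, "X")
print(dict(buckets), max_degree)
```

Summing such buckets over all sentences gives totals like those listed above, where the four buckets add up to the 927 X tokens.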
Children of X nodes are attached using 17 different relations: nmod (203; 32% instances), obj (114; 18% instances), case (76; 12% instances), punct (45; 7% instances), amod (31; 5% instances), ccomp (26; 4% instances), nummod (24; 4% instances), cc (21; 3% instances), xcomp (17; 3% instances), mark (16; 3% instances), advmod (15; 2% instances), iobj (15; 2% instances), cop (11; 2% instances), nsubj (9; 1% instances), nmod:poss (5; 1% instances), aux (1; 0% instances), det (1; 0% instances)
Children of X nodes belong to 15 different parts of speech: NOUN (228; 36% instances), ADP (76; 12% instances), PROPN (45; 7% instances), PUNCT (45; 7% instances), VERB (43; 7% instances), ADJ (32; 5% instances), PRON (32; 5% instances), X (30; 5% instances), NUM (25; 4% instances), ADV (22; 3% instances), CCONJ (21; 3% instances), SCONJ (13; 2% instances), AUX (12; 2% instances), PART (4; 1% instances), DET (2; 0% instances)