Treebank Statistics: UD_Catalan-AnCora: POS Tags: NOUN
There are 7625 NOUN lemmas (29%), 10041 NOUN types (27%) and 98646 NOUN tokens (18%).
Out of 16 observed tags, the rank of NOUN is: 2 in number of lemmas, 2 in number of types and 1 in number of tokens.
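The lemma, type and token counts above, and the resulting ranks, can be reproduced from a CoNLL-U file along the following lines. This is a minimal sketch rather than the script that generated this page: `SAMPLE`, `read_tokens`, `pos_counts` and `rank` are hypothetical names, the mini-corpus is hand-made, and word forms are assumed to be compared case-sensitively.

```python
from collections import defaultdict

# Hypothetical mini-corpus in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 Els el DET _ _ 2 det _ _
2 anys any NOUN _ _ 3 nsubj _ _
3 passen passar VERB _ _ 0 root _ _

1 Un un DET _ _ 2 det _ _
2 any any NOUN _ _ 3 nsubj _ _
3 passa passar VERB _ _ 0 root _ _

1 El el DET _ _ 2 det _ _
2 dia dia NOUN _ _ 3 nsubj _ _
3 passa passar VERB _ _ 0 root _ _
""".replace(" ", "\t")


def read_tokens(conllu_text):
    """Yield the ten columns of each regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skips blank lines, comments, multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def pos_counts(conllu_text):
    """Map each UPOS tag to (distinct lemmas, distinct forms, tokens)."""
    lemmas, types, tokens = defaultdict(set), defaultdict(set), defaultdict(int)
    for cols in read_tokens(conllu_text):
        form, lemma, upos = cols[1], cols[2], cols[3]
        lemmas[upos].add(lemma)
        types[upos].add(form)  # assumption: forms are case-sensitive
        tokens[upos] += 1
    return {u: (len(lemmas[u]), len(types[u]), tokens[u]) for u in tokens}


def rank(counts, upos, index):
    """1-based rank of a tag by lemmas (index 0), types (1) or tokens (2)."""
    ordered = sorted(counts, key=lambda u: counts[u][index], reverse=True)
    return ordered.index(upos) + 1


counts = pos_counts(SAMPLE)
print(counts["NOUN"])  # (2, 3, 3)
```

Ties are broken by first occurrence here; how the original report breaks ties is not stated.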
The 10 most frequent NOUN lemmas: any, milió, pesseta, dia, obra, persona, president, mes, grup, empresa
The 10 most frequent NOUN types: any, anys, milions, pessetes, president, persones, dia, part, cas, grup
The 10 most frequent ambiguous lemmas: any (NOUN 1780, PROPN 7), dia (NOUN 567, PROPN 3), obra (NOUN 508, PROPN 1), empresa (NOUN 422, PROPN 2), cas (NOUN 407, PROPN 3), fet (NOUN 404, ADJ 59, PROPN 1, VERB 1), projecte (NOUN 360, PROPN 2), hora (NOUN 353, PROPN 2), país (NOUN 342, PROPN 3), cap (NOUN 323, DET 215, PRON 23, ADP 7, PROPN 4)
The 10 most frequent ambiguous types: anys (NOUN 868, PROPN 7), dia (NOUN 352, PROPN 3), cas (NOUN 323, PROPN 3), cap (NOUN 288, DET 209, PRON 18, ADP 7, PROPN 4), fet (NOUN 266, VERB 263, ADJ 19, PROPN 1), obra (NOUN 253, PROPN 1), empresa (NOUN 240, PROPN 2), partit (NOUN 231, VERB 2), director (NOUN 228, ADJ 10), temps (NOUN 227, PROPN 5)
- anys
- dia
- cas
- cap
  - NOUN 288: A el nostre país no n’ hi ha cap .
  - DET 209: Durant la vista oral , cap prova material va incriminar l’ acusada .
  - PRON 18: No sóc cap de les dones que se ‘ns proposen com a models socials .
  - ADP 7: Els qui som una mica més de cap allà ja tenim un altre debat .
  - PROPN 4: El disc recupera cançons tradicionals reinventades com la cançó anònima de el segle XIX ‘ Si el rei vol corona ‘ o la cançó de festa ‘ No en volem cap ‘ .
- fet
  - NOUN 266: De fet , en va ser el progenitor intel·lectual i emocional .
  - VERB 263: També s’ ha fet una ofrena simbòlica a ‘ la Moreneta ‘ .
  - ADJ 19: [ Francesc ] Homs ha dit que l’ acord estava gairebé fet .
  - PROPN 1: Només cal posar se a les portes de qualsevol tribunal de qualsevol lloc de el món per confirmar que , almenys una vegada a el dia , algú surt de l’ edifici exclamant amb satisfacció la frase : “ S’ ha fet justícia “ .
- obra
  - NOUN 253: Una besnéta de Clarín descobreix la primera obra teatral de l’ autor .
  - PROPN 1: L’ exposició ‘ La Catedral de Girona . L’ obra de la seu ‘ s’ estructura en quatre parts , que fan referència a les transformacions que a el llarg de més de mil anys , van portar de el temple romà a l’ església romànica , de la catedral gòtica a l’ edifici actual .
- empresa
  - NOUN 240: L’ empresa ja havia acomiadat 8.000 persones abans d’ anunciar el pla .
  - PROPN 2: Enguany la UETE ofereix 19 cursos de els quals 5 són de Medi Ambient , 6 d’ Economia d’ empresa , 2 de Salut , 2 d’ Educació , 1 d’ Història , 1 d’ Arquitectura i Urbanisme , 1 de Filologia i 1 de Noves tecnologies .
- partit
  - NOUN 231: El partit va prendre cos seguint el camí de els dos primers .
  - VERB 2: En opinió de Fernández Santiago aquesta iniciativa és “ l’ assumpte més greu que ha passat en molt temps “ en aquesta matèria , ja que les anteriors iniciatives per a la devolució de els lligalls havien partit de CiU , mentre que aquesta neix de el PSC .
- director
- temps
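An ambiguous type, as listed above, is a word form that carries NOUN in some occurrences and another tag elsewhere. The sketch below shows one way to collect such forms with their per-tag counts; `SAMPLE`, `read_tokens` and `ambiguous_forms` are hypothetical names, the rows are hand-made rather than taken from the treebank, and forms are assumed to be compared case-sensitively.

```python
from collections import Counter, defaultdict

# Hypothetical rows in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 cap cap NOUN _ _ 0 root _ _

1 cap cap DET _ _ 2 det _ _
2 prova prova NOUN _ _ 0 root _ _

1 el el DET _ _ 2 det _ _
2 cap cap NOUN _ _ 0 root _ _

1 fet fet NOUN _ _ 0 root _ _

1 ha haver AUX _ _ 2 aux _ _
2 fet fer VERB _ _ 0 root _ _
""".replace(" ", "\t")


def read_tokens(conllu_text):
    """Yield the ten columns of each regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def ambiguous_forms(conllu_text, upos="NOUN"):
    """Forms tagged `upos` at least once and some other tag at least once,
    sorted by how often they carry `upos`."""
    tags_of = defaultdict(Counter)
    for cols in read_tokens(conllu_text):
        tags_of[cols[1]][cols[3]] += 1
    hits = [(form, tags) for form, tags in tags_of.items()
            if upos in tags and len(tags) > 1]
    return sorted(hits, key=lambda item: item[1][upos], reverse=True)


for form, tags in ambiguous_forms(SAMPLE):
    print(form, dict(tags))
```

Replacing `cols[1]` with `cols[2]` groups by lemma instead of form, which corresponds to the ambiguous-lemma list.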
Morphology
The form / lemma ratio of NOUN is 1.316852 (the average of all parts of speech is 1.416814).
The 1st highest number of forms (5) was observed with the lemma “metre”: m, m., m.36, metre, metres.
The 2nd highest number of forms (5) was observed with the lemma “pesseta”: PTA, pesseta, pessetes, pta., ptes..
The 3rd highest number of forms (3) was observed with the lemma “article”: ART, article, articles.
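The reported ratio agrees with dividing the number of NOUN types by the number of NOUN lemmas given earlier (10041 / 7625 ≈ 1.316852). The sketch below computes the same quantity and the lemmas with the most forms from CoNLL-U input; `SAMPLE`, `read_tokens` and `form_lemma_stats` are hypothetical names, the rows are hand-made, and the exact counting rules of the original report are an assumption.

```python
from collections import defaultdict

# Hypothetical rows in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 pesseta pesseta NOUN _ _ 0 root _ _
2 pessetes pesseta NOUN _ _ 1 conj _ _
3 PTA pesseta NOUN _ _ 1 conj _ _
4 metre metre NOUN _ _ 1 conj _ _
5 metres metre NOUN _ _ 1 conj _ _
6 any any NOUN _ _ 1 conj _ _
7 pessetes pesseta NOUN _ _ 1 conj _ _
""".replace(" ", "\t")


def read_tokens(conllu_text):
    """Yield the ten columns of each regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def form_lemma_stats(conllu_text, upos="NOUN"):
    """Return (distinct forms / distinct lemmas, lemmas sorted by form count)."""
    forms_of = defaultdict(set)
    for cols in read_tokens(conllu_text):
        if cols[3] == upos:
            forms_of[cols[2]].add(cols[1])
    all_forms = set().union(*forms_of.values())
    ratio = len(all_forms) / len(forms_of)
    # Most forms first; ties are broken alphabetically by lemma.
    order = sorted(forms_of, key=lambda lemma: (-len(forms_of[lemma]), lemma))
    return ratio, [(lemma, sorted(forms_of[lemma])) for lemma in order]


ratio, ranked = form_lemma_stats(SAMPLE)
print(ratio, ranked[0])

# Consistency check against the figures quoted on this page.
print(round(10041 / 7625, 6))  # 1.316852
```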
NOUN occurs with 3 features: Number (89723; 91% instances), Gender (86128; 87% instances), Foreign (9; 0% instances)
NOUN occurs with 5 feature-value pairs: Foreign=Yes, Gender=Fem, Gender=Masc, Number=Plur, Number=Sing
NOUN occurs with 10 feature combinations.
The most frequent feature combination is Gender=Fem|Number=Sing (30486 tokens).
Examples: obra, empresa, llei, ciutat, zona, cosa, situació, banda, manera, setmana
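Feature names, feature-value pairs and full feature combinations can all be tallied from the FEATS column. The following is a rough sketch with hypothetical names (`SAMPLE`, `read_tokens`, `feature_stats`) and hand-made annotations; it treats an empty feature set (`_`) as a combination of its own, which is an assumption about how the report counts.

```python
from collections import Counter

# Hypothetical rows in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 obra obra NOUN _ Gender=Fem|Number=Sing 0 root _ _
2 empresa empresa NOUN _ Gender=Fem|Number=Sing 1 conj _ _
3 anys any NOUN _ Gender=Masc|Number=Plur 1 conj _ _
4 temps temps NOUN _ Gender=Masc 1 conj _ _
5 m. metre NOUN _ _ 1 conj _ _
""".replace(" ", "\t")


def read_tokens(conllu_text):
    """Yield the ten columns of each regular token line."""
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def feature_stats(conllu_text, upos="NOUN"):
    """Count feature names, feature=value pairs and whole FEATS strings."""
    names, pairs, combos = Counter(), Counter(), Counter()
    for cols in read_tokens(conllu_text):
        if cols[3] != upos:
            continue
        feats = cols[5]
        combos[feats] += 1  # "_" (no features) is counted as a combination
        if feats == "_":
            continue
        for pair in feats.split("|"):
            pairs[pair] += 1
            names[pair.split("=", 1)[0]] += 1
    return names, pairs, combos


names, pairs, combos = feature_stats(SAMPLE)
print(combos.most_common(1))
```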
Relations
NOUN nodes are attached to their parents using 28 different relations: nmod (28395; 29% instances), obj (16481; 17% instances), obl (16100; 16% instances), nsubj (14722; 15% instances), conj (6361; 6% instances), obl:arg (5004; 5% instances), fixed (3298; 3% instances), appos (3141; 3% instances), compound (1454; 1% instances), root (1274; 1% instances), obl:agent (695; 1% instances), ccomp (348; 0% instances), case (342; 0% instances), acl (241; 0% instances), advcl (211; 0% instances), xcomp (129; 0% instances), advmod (128; 0% instances), mark (102; 0% instances), cc (47; 0% instances), parataxis (47; 0% instances), dep (35; 0% instances), csubj (30; 0% instances), acl:relcl (27; 0% instances), nsubj:pass (20; 0% instances), nsubj:outer (7; 0% instances), flat (4; 0% instances), nummod (2; 0% instances), dislocated (1; 0% instances)
Parents of NOUN nodes belong to 15 different parts of speech: VERB (48453; 49% instances), NOUN (34947; 35% instances), ADJ (4431; 4% instances), ADP (2839; 3% instances), PROPN (1852; 2% instances), NUM (1346; 1% instances), [root, no POS tag] (1274; 1% instances), ADV (1241; 1% instances), PRON (816; 1% instances), AUX (447; 0% instances), DET (401; 0% instances), SYM (216; 0% instances), CCONJ (156; 0% instances), SCONJ (135; 0% instances), PART (92; 0% instances)
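Both distributions above come from the HEAD and DEPREL columns: the relation is read off the NOUN token itself, and the parent's part of speech is looked up through the head index. A minimal sketch follows, with hypothetical names (`SAMPLE`, `read_sentences`, `parent_stats`) and hand-made annotations. A head of 0 points at the artificial root, which has no tag; the sketch labels it `ROOT`, and the 1274 untagged parents in the list match the 1274 `root` relations.

```python
from collections import Counter

# Hypothetical annotations in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 La el DET _ _ 3 det _ _
2 primera primer ADJ _ _ 3 amod _ _
3 obra obra NOUN _ _ 0 root _ _
4 teatral teatral ADJ _ _ 3 amod _ _
5 de de ADP _ _ 7 case _ _
6 l' el DET _ _ 7 det _ _
7 autor autor NOUN _ _ 3 nmod _ _

1 Passen passar VERB _ _ 0 root _ _
2 anys any NOUN _ _ 1 nsubj _ _

1 El el DET _ _ 2 det _ _
2 dia dia NOUN _ _ 3 nsubj _ _
3 passa passar VERB _ _ 0 root _ _
""".replace(" ", "\t")


def read_sentences(conllu_text):
    """Yield each sentence as a list of ten-column token rows."""
    sentence = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)
        elif not line.strip() and sentence:
            yield sentence
            sentence = []
    if sentence:
        yield sentence


def parent_stats(conllu_text, upos="NOUN"):
    """Count incoming relations and parent tags of `upos` nodes."""
    relations, parent_pos = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        pos_of = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if cols[3] == upos:
                relations[cols[7]] += 1
                # Head 0 is the artificial root and has no tag of its own.
                parent_pos[pos_of.get(cols[6], "ROOT")] += 1
    return relations, parent_pos


relations, parent_pos = parent_stats(SAMPLE)
print(dict(relations), dict(parent_pos))
```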
5587 (6%) NOUN nodes are leaves.
21798 (22%) NOUN nodes have one child.
30033 (30%) NOUN nodes have two children.
41228 (42%) NOUN nodes have three or more children.
The highest child degree of a NOUN node is 18.
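The four buckets above partition all NOUN tokens (5587 + 21798 + 30033 + 41228 = 98646). A child-degree histogram of this kind can be sketched as follows; `SAMPLE`, `read_sentences` and `child_degrees` are hypothetical names, and the annotations are hand-made rather than taken from the treebank.

```python
from collections import Counter

# Hypothetical annotations in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 La el DET _ _ 3 det _ _
2 primera primer ADJ _ _ 3 amod _ _
3 obra obra NOUN _ _ 0 root _ _
4 teatral teatral ADJ _ _ 3 amod _ _
5 de de ADP _ _ 7 case _ _
6 l' el DET _ _ 7 det _ _
7 autor autor NOUN _ _ 3 nmod _ _

1 Passen passar VERB _ _ 0 root _ _
2 anys any NOUN _ _ 1 nsubj _ _

1 El el DET _ _ 2 det _ _
2 dia dia NOUN _ _ 3 nsubj _ _
3 passa passar VERB _ _ 0 root _ _
""".replace(" ", "\t")


def read_sentences(conllu_text):
    """Yield each sentence as a list of ten-column token rows."""
    sentence = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)
        elif not line.strip() and sentence:
            yield sentence
            sentence = []
    if sentence:
        yield sentence


def child_degrees(conllu_text, upos="NOUN"):
    """Histogram of child counts (bucket 3 means three or more) and the maximum."""
    histogram, highest = Counter(), 0
    for sentence in read_sentences(conllu_text):
        n_children = Counter(cols[6] for cols in sentence)  # keyed by head id
        for cols in sentence:
            if cols[3] == upos:
                degree = n_children[cols[0]]
                histogram[min(degree, 3)] += 1
                highest = max(highest, degree)
    return histogram, highest


histogram, highest = child_degrees(SAMPLE)
print(dict(histogram), highest)

# Consistency check against the figures quoted on this page.
print(5587 + 21798 + 30033 + 41228)  # 98646
```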
Children of NOUN nodes are attached using 32 different relations: det (68344; 29% instances), case (51292; 22% instances), nmod (35106; 15% instances), amod (23549; 10% instances), punct (14285; 6% instances), acl (8274; 4% instances), conj (6364; 3% instances), appos (6251; 3% instances), cc (5627; 2% instances), nummod (4661; 2% instances), cop (2115; 1% instances), advmod (1867; 1% instances), nsubj (1461; 1% instances), compound (1241; 1% instances), mark (1115; 0% instances), fixed (775; 0% instances), obl (481; 0% instances), aux (365; 0% instances), advcl (252; 0% instances), parataxis (81; 0% instances), csubj (67; 0% instances), obl:arg (39; 0% instances), obj (38; 0% instances), dep (21; 0% instances), acl:relcl (16; 0% instances), flat (12; 0% instances), ccomp (4; 0% instances), expl:pass (3; 0% instances), advmod:emph (1; 0% instances), dislocated (1; 0% instances), nsubj:outer (1; 0% instances), xcomp (1; 0% instances)
Children of NOUN nodes belong to 15 different parts of speech: DET (68549; 29% instances), ADP (51045; 22% instances), NOUN (34947; 15% instances), ADJ (24015; 10% instances), PUNCT (14285; 6% instances), PROPN (13485; 6% instances), VERB (8401; 4% instances), NUM (6041; 3% instances), CCONJ (5334; 2% instances), AUX (2540; 1% instances), ADV (2118; 1% instances), SCONJ (1814; 1% instances), PRON (927; 0% instances), SYM (185; 0% instances), PART (24; 0% instances)
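The child-side distributions are the mirror image of the parent-side ones: for every token whose head is a NOUN, record that token's relation and tag. A short sketch under the same caveats as before, with hypothetical names (`SAMPLE`, `read_sentences`, `child_stats`) and hand-made annotations:

```python
from collections import Counter

# Hypothetical annotations in CoNLL-U column order (spaces stand in for tabs).
SAMPLE = """
1 La el DET _ _ 3 det _ _
2 primera primer ADJ _ _ 3 amod _ _
3 obra obra NOUN _ _ 0 root _ _
4 teatral teatral ADJ _ _ 3 amod _ _
5 de de ADP _ _ 7 case _ _
6 l' el DET _ _ 7 det _ _
7 autor autor NOUN _ _ 3 nmod _ _

1 El el DET _ _ 2 det _ _
2 dia dia NOUN _ _ 3 nsubj _ _
3 passa passar VERB _ _ 0 root _ _
""".replace(" ", "\t")


def read_sentences(conllu_text):
    """Yield each sentence as a list of ten-column token rows."""
    sentence = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sentence.append(cols)
        elif not line.strip() and sentence:
            yield sentence
            sentence = []
    if sentence:
        yield sentence


def child_stats(conllu_text, upos="NOUN"):
    """Count relations and tags of tokens whose head is tagged `upos`."""
    relations, child_pos = Counter(), Counter()
    for sentence in read_sentences(conllu_text):
        pos_of = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if pos_of.get(cols[6]) == upos:
                relations[cols[7]] += 1
                child_pos[cols[3]] += 1
    return relations, child_pos


relations, child_pos = child_stats(SAMPLE)
print(dict(relations), dict(child_pos))
```

Because a NOUN child of a NOUN parent is counted once from each side, the NOUN entry is the same in the parent and child lists (34947).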