UD Spanish Sign Language LSE
Language: Spanish Sign Language (code: ssp)
Family: Sign Language
This treebank has been part of Universal Dependencies since the UD v2.15 release.
The following people have contributed to making this treebank part of UD: José María García-Miguel, Carmen Cabeza.
Repository: UD_Spanish_Sign_Language-LSE
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15
License: CC BY-SA 4.0
Genre: fiction
Questions, comments? General annotation questions (either Spanish Sign Language-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [gallego (æt) uvigo • gal]. Development of the treebank happens directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.
Annotation | Source |
---|---|
Lemmas | annotated manually |
UPOS | annotated manually, natively in UD style |
XPOS | not available |
Features | annotated manually, natively in UD style |
Relations | annotated manually, natively in UD style |
Description
The Universal Dependency treebank for Spanish Sign Language (Lengua de Signos Española [LSE], ISO 639-3: ssp) was developed by the GRADES group at the University of Vigo.
This first release of the LSE-RADIS treebank comprises 488 sentences and 1881 tokens drawn from three narratives and a set of 100 elicited isolated sentences. We hope that future releases will include more texts and sentences.
The LSE-RADIS dependency treebank is derived from the RADIS corpus, which consists of 48 recordings of elicited narratives, interviews and isolated sentences signed in LSE. The ELAN (.eaf) files with glossing and grammatical annotation are available at Zenodo.
Most recordings, along with glosses and translations, can be browsed and searched at http://isignos.uvigo.gal/en, where a Signbank lexicon with all the id-glosses is also available.
Acknowledgments
Glossing of the video recordings was performed mainly by Juan Ramón Valiño and Ania Pérez.
Manual grammatical annotation of the corpus, including syntactic annotation, format conversions and revisions, was performed by Carmen Cabeza and José M. García-Miguel.
References
- Cabeza, Carmen / García-Miguel, José M. (dirs): iSignos: Interfaz de datos de Lengua de Signos Española (versión 1.0). Universidade de Vigo. http://isignos.uvigo.gal/en
- Cabeza Pereiro, María del Carmen, Ania Pérez Pérez, Juan R. Valiño Freire & José M. García-Miguel Gallego. 2024. Annotations for LSE-RADIS corpus. Zenodo. https://doi.org/10.5281/zenodo.10670864.
- García-Miguel, José M. & Carmen Cabeza. 2019. Hacia un treebank de dependencias para la LSE. Hesperia 22(2). 111–143. https://doi.org/10.35869/hafh.v23i0.1657.
- Pérez, Ania, José Mª García-Miguel & Carmen Cabeza. 2019. Anotación de corpus para o estudo da expresión gramatical de eventos: notas sobre o deseño do proxecto RADIS. Sensos-e 6(1). 40–61. https://doi.org/10.34630/sensos-e.v6i1.3527.
Statistics of UD Spanish Sign Language LSE
POS Tags
ADJ – ADP – ADV – AUX – CCONJ – DET – NOUN – NUM – PART – PRON – SCONJ – VERB – X
Features
Relations
acl – advcl – advmod – amod – appos – aux – case – cc – ccomp – compound:redup – compound:svc – compound:vsc – conj – dep – det – discourse – iobj – mark – nmod – nsubj – nummod – obj – obl – parataxis – reparandum – root – vocative – xcomp
Tokenization and Word Segmentation
- This corpus contains 488 sentences and 1393 tokens.
- All tokens in this corpus are followed by a space.
- This corpus does not contain words with spaces.
- This corpus contains 292 types of words that contain both letters and punctuation. Examples: INDX.PRO:1sg, cl.m(A):agarrar-manillar, LOS-DOS, cl.e:CASA, SOPLAR(2M), cl.d(c):frasco-forma, cl.e:CASA-moverse, B.L(3), LO-DE-ANTES, PEQUEÑO2(2M), cl.c(1):PATAS-andar, (CNM), BUSCAR(2Msu), OTRA-VEZ, RANA(M-DE), cl.c(1):PATAS-retroceder, cl.e(2d):PERSONA-ascender+a-árbol, cl.e:CASA-desintegrarse, cl.m(5d>5):coger+guardar-fruta, B.L(2), DAR(2M), ES.SEGURO, ES.SEGURO(1M), INDX.PRO:3pl, JUGAR(CT), LOS-TRES, MIRAR(2M), PENSAR(2Msu), PREPARAR(1M), cl.d(5):pecho-hinchar, cl.e(1):PALOS-ascender, cl.e(1):PERSONA-aproximarse, cl.e(3):PERSONAS3-aproximarse, cl.e(3):PERSONAS3-desplazarse, cl.m(Xc):colocar-cesta, cl.m(Xc):golpear+con-palo, CASA(M-AB), G(5):desesperación, G(B):vale, G:¡ah!, GRACIAS(1M), GRITAR(2M), HOMBR-, INDX.AUX, INDX.LOC, LISTO(1M), LLAMAR(MP), LOBO(1M), MIRA-TU, MUJE-
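The counts above can be approximated with a few lines of Python. The following is a minimal sketch, not the script that generated these statistics: the two-sentence `SAMPLE` is a hypothetical CoNLL-U fragment built from glosses listed above, and the test "contains at least one Unicode letter and one Unicode punctuation character" is an assumption about how mixed word types are detected.

```python
import unicodedata

# Hypothetical CoNLL-U fragment for illustration only (not copied from the treebank).
SAMPLE = """\
# sent_id = demo-1
1\tINDX.PRO:1sg\t_\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tMIRAR(2M)\t_\tVERB\t_\t_\t0\troot\t_\t_
3\tRANA\t_\tNOUN\t_\t_\t2\tobj\t_\t_

# sent_id = demo-2
1\tLOS-DOS\t_\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tJUGAR\t_\tVERB\t_\t_\t0\troot\t_\t_
"""


def read_forms(text):
    """Return a list of sentences, each a list of word forms (column 2)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Keep plain integer IDs; skip ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                current.append(cols[1])
    if current:
        sentences.append(current)
    return sentences


def has_letters_and_punctuation(form):
    """Assumed criterion: at least one letter (L*) and one punctuation (P*) character."""
    cats = [unicodedata.category(ch) for ch in form]
    return any(c.startswith("L") for c in cats) and any(c.startswith("P") for c in cats)


sentences = read_forms(SAMPLE)
n_tokens = sum(len(s) for s in sentences)
mixed = sorted({f for s in sentences for f in s if has_letters_and_punctuation(f)})
print(len(sentences), n_tokens, mixed)
```

To run the same count on the real data, replace `SAMPLE` with the contents of the treebank's `.conllu` file. Note that characters such as `+` and `>` are Unicode symbols rather than punctuation, so a gloss that combines letters only with those characters would not be counted under this criterion.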
Morphology
Tags
- This corpus uses 13 UPOS tags out of 17 possible: ADJ, ADP, ADV, AUX, CCONJ, DET, NOUN, NUM, PART, PRON, SCONJ, VERB, X
- This corpus does not use the following tags: PROPN, INTJ, SYM, PUNCT
- This corpus contains 1 word type tagged as a particle (PART): NO
- This corpus contains 1 lemma tagged as a pronoun (PRON): _
- This corpus contains 1 lemma tagged as a determiner (DET): _
- Out of the above, 1 lemma occurred sometimes as PRON and sometimes as DET: _
- This corpus contains 1 lemma tagged as an auxiliary (AUX): _
- Out of the above, 1 lemma occurred sometimes as AUX and sometimes as VERB: _
- This corpus does not use the VerbForm feature.
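The tag inventory reported above could be recomputed along the following lines. This is a hedged sketch: `SAMPLE` is a hypothetical one-sentence CoNLL-U fragment, and `upos_counts` is an illustrative helper, not part of any UD tool. The set `UPOS_TAGS` lists the 17 universal part-of-speech tags.

```python
from collections import Counter

# The 17 universal POS tags defined by UD.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Hypothetical CoNLL-U fragment for illustration only.
SAMPLE = """\
1\tINDX.PRO:1sg\t_\tPRON\t_\t_\t3\tnsubj\t_\t_
2\tNO\t_\tPART\t_\t_\t3\tadvmod\t_\t_
3\tMIRAR\t_\tVERB\t_\t_\t0\troot\t_\t_
4\tRANA\t_\tNOUN\t_\t_\t3\tobj\t_\t_
"""


def upos_counts(text):
    """Count the UPOS value (column 4) of every regular word line."""
    counts = Counter()
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            counts[cols[3]] += 1
    return counts


counts = upos_counts(SAMPLE)
used = sorted(counts)
unused = sorted(UPOS_TAGS - set(counts))
print(f"{len(used)} of {len(UPOS_TAGS)} tags used: {', '.join(used)}")
print(f"unused: {', '.join(unused)}")
```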
Nominal Features
Degree and Polarity
Verbal Features
Pronouns, Determiners, Quantifiers
Other Features
Syntax
Auxiliary Verbs and Copula
- This corpus does not contain copulas.
- This corpus uses 1 lemma as an auxiliary (aux). Example: _.
Core Arguments, Oblique Arguments and Adjuncts
Here we consider only relations between verbs (parent) and nouns or pronouns (child).
- nsubj
  - VERB--NOUN (161)
  - VERB--PRON (26)
- obj
  - VERB--NOUN (89)
  - VERB--PRON (3)
- iobj
  - VERB--NOUN (9)
  - VERB--PRON (2)
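A tally of this kind, restricted to verbal parents and nominal or pronominal children, might be computed as sketched below. The `SAMPLE` sentences are hypothetical and `core_argument_pairs` is an illustrative helper, so the numbers it prints describe only the sample and not the treebank.

```python
from collections import Counter

CORE = {"nsubj", "obj", "iobj"}

# Hypothetical CoNLL-U fragment for illustration only.
SAMPLE = """\
# sent_id = demo-1
1\tINDX.PRO:1sg\t_\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tDAR\t_\tVERB\t_\t_\t0\troot\t_\t_
3\tFRUTA\t_\tNOUN\t_\t_\t2\tobj\t_\t_
4\tHOMBRE\t_\tNOUN\t_\t_\t2\tiobj\t_\t_

# sent_id = demo-2
1\tRANA\t_\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tSALTAR\t_\tVERB\t_\t_\t0\troot\t_\t_
"""


def core_argument_pairs(text):
    """Count (relation, 'VERB--CHILD') pairs for NOUN/PRON children of verbs."""
    counts = Counter()
    for block in text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        rows = [r for r in rows if r[0].isdigit()]
        upos_by_id = {r[0]: r[3] for r in rows}  # word ID -> UPOS
        for r in rows:
            child_upos, head, deprel = r[3], r[6], r[7]
            if (deprel in CORE
                    and upos_by_id.get(head) == "VERB"
                    and child_upos in {"NOUN", "PRON"}):
                counts[(deprel, f"VERB--{child_upos}")] += 1
    return counts


pairs = core_argument_pairs(SAMPLE)
for (deprel, pattern), n in sorted(pairs.items()):
    print(f"{deprel}: {pattern} ({n})")
```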
Relations Overview
- This corpus uses 3 relation subtypes: compound:redup, compound:svc, compound:vsc
- The following main type is never used alone; it is always subtyped: compound
- The following 11 relation types are not used in this corpus at all: csubj, expl, dislocated, cop, clf, fixed, flat, list, orphan, goeswith, punct
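The three figures in this overview follow from the relation labels listed earlier on this page and the 37 universal relations of UD v2. As a short sketch of that arithmetic (the variable names are illustrative only):

```python
# The 37 universal dependency relations of UD v2.
UNIVERSAL = """acl advcl advmod amod appos aux case cc ccomp clf compound conj
cop csubj dep det discourse dislocated expl fixed flat goeswith iobj list mark
nmod nsubj nummod obj obl orphan parataxis punct reparandum root vocative
xcomp""".split()

# Relation labels listed for this treebank under "Relations".
USED = """acl advcl advmod amod appos aux case cc ccomp compound:redup
compound:svc compound:vsc conj dep det discourse iobj mark nmod nsubj nummod
obj obl parataxis reparandum root vocative xcomp""".split()

# Labels with a colon are language-specific subtypes of a universal relation.
subtypes = sorted(r for r in USED if ":" in r)

# Main types that occur in the treebank, with or without a subtype.
main_used = {r.split(":")[0] for r in USED}

# Main types that never appear as a bare label.
always_subtyped = sorted(m for m in main_used if m not in USED)

# Universal relations that do not occur at all.
unused = sorted(set(UNIVERSAL) - main_used)

print(len(subtypes), subtypes)
print(len(always_subtyped), always_subtyped)
print(len(unused), unused)
```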