This page pertains to UD version 2.

UD Romanian TueCL

Language: Romanian (code: ro)
Family: Indo-European

This treebank has been part of Universal Dependencies since the UD v2.14 release.

The following people have contributed to making this treebank part of UD: Diana Hoefels, Çağrı Çöltekin.

Repository: UD_Romanian-TueCL
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15

License: CC BY-SA 4.0

Genre: social

Questions, comments? General annotation questions (either Romanian-specific or cross-linguistic) can be raised in the main UD issue tracker. Bugs in this treebank can be reported in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [diana-constantina • hoefels (æt) student • uni-tuebingen • de or diana • hoefels (æt) gmail • com, cagri • coeltekin (æt) uni-tuebingen • de]. The treebank is developed directly in the UD repository, so you may submit bug fixes as pull requests against the dev branch.

Annotation Source
Lemmas annotated manually
UPOS annotated manually, natively in UD style
XPOS not available
Features annotated manually, natively in UD style
Relations annotated manually, natively in UD style

Description

The Romanian Social Media Sexist Language UD Treebank is a reference treebank in Universal Dependencies (UD) format for Romanian sexist language. Currently small, it comprises a subset of tweets sourced from CoRoSeOf.

The Romanian Social Media Sexist Language UD Treebank is a specialized linguistic resource for analyzing sexist language in Romanian social media. It contains 210 annotated tweets selected from CoRoSeOf. As part of the UD_Romanian-TueCL project, it fills a gap in Romanian linguistic resources: it is the first UD treebank to specifically address sexist language in the social media genre. The project is a work in progress, and the treebank is updated regularly.

Acknowledgments

This treebank was created on the initiative of Dr. Çağrı Çöltekin, lecturer at the University of Tübingen, as part of a course project focused on low-resourced languages. Romanian is not a low-resourced language, but it lacked a UD-compliant social media corpus. Diana C. Hoefels constructed and annotated the corpus; Dr. Çağrı Çöltekin reviewed the annotation, advised on the guidelines, and wrote the documentation.

Statistics of UD Romanian TueCL

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X

Features

Abbr, AdpType, Case, Definite, Degree, Foreign, Gender, Mood, Number, Number[psor], NumForm, NumType, PartType, Person, Polarity, Position, Poss, PronType, Reflex, Strength, Tense, Typo, Variant, VerbForm

Relations

acl, advcl, advcl:tcl, advmod, advmod:tmod, amod, appos, aux, aux:pass, case, cc, cc:preconj, ccomp, ccomp:pmod, compound, conj, cop, csubj, dep, det, discourse, discourse:emo, expl, expl:pass, expl:poss, expl:pv, fixed, flat, goeswith, iobj, list, mark, nmod, nsubj, nsubj:pass, nummod, obj, obl, obl:agent, obl:pmod, obl:tmod, orphan, parataxis, punct, reparandum, root, vocative, vocative:mention, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
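
The restriction above (verb parent, noun or pronoun child) can be illustrated with a minimal Python sketch over CoNLL-U text. This is not the script that produced the statistics on this page. The function name verb_nominal_relations and the one-sentence sample are invented for illustration, and treating only NOUN and PRON (not PROPN) as "nouns or pronouns" is an assumption.

```python
from collections import Counter

# Invented sample sentence in CoNLL-U format (not taken from the treebank).
SAMPLE = "\n".join([
    "# text = Ea citește cartea.",
    "1\tEa\tea\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tcitește\tciti\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tcartea\tcarte\tNOUN\t_\t_\t2\tobj\t_\t_",
    "4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
])


def verb_nominal_relations(conllu_text, child_upos=("NOUN", "PRON")):
    """Count relation labels where the parent is a VERB and the child is nominal.

    child_upos is an assumption about what counts as "nouns or pronouns".
    """
    counts = Counter()
    # Sentences are separated by blank lines in CoNLL-U.
    for block in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in block.splitlines()
                if line and not line.startswith("#")]
        # Keep plain word lines only: multiword ranges (1-2) and
        # empty nodes (1.1) have non-integer IDs and are skipped.
        words = {row[0]: row for row in rows if row[0].isdigit()}
        for row in words.values():
            head = words.get(row[6])  # None for the root, whose HEAD is 0
            if head is not None and head[3] == "VERB" and row[3] in child_upos:
                counts[row[7]] += 1
    return counts


if __name__ == "__main__":
    for relation, n in sorted(verb_nominal_relations(SAMPLE).items()):
        print(relation, n)
```

To run this on the treebank itself, read a .conllu file from the UD_Romanian-TueCL repository into a string and pass it to the function. Passing child_upos=("NOUN", "PROPN", "PRON") would also count proper nouns.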

Reflexive Verbs

Reflexive Passive

Verbs with Reflexive Core Objects

Relations Overview