
This page pertains to UD version 2.

UD Turkish Kenet

Language: Turkish (code: tr)
Family: Turkic

This treebank has been part of Universal Dependencies since the UD v2.8 release.

The following people have contributed to making this treebank part of UD: Aslı Kuzgun, Neslihan Cesur, Olcay Taner Yıldız, Oğuzhan Kuyrukçu, Arife Betül Yenice, Bilge Nas Arıcan, Ezgi Sanıyar.

Repository: UD_Turkish-Kenet
Search this treebank on-line: PML-TQ
Download all treebanks: UD 2.15

License: CC BY-SA 4.0

Genre: grammar-examples

Questions, comments? General annotation questions (either Turkish-specific or cross-linguistic) can be raised in the main UD issue tracker. You can report bugs in this treebank in the treebank-specific issue tracker on GitHub. If you want to collaborate, please contact [kuzgunasli (æt) gmail • com / olcay • yildiz (æt) ozyegin • edu • tr]. Development of the treebank happens outside the UD repository. If there are bugs, either the original data source or the conversion procedure must be fixed. Do not submit pull requests against the UD repository.

Annotation: Source
Lemmas: annotated manually in non-UD style, automatically converted to UD
UPOS: annotated manually in non-UD style, automatically converted to UD
XPOS: annotated manually in non-UD style, automatically converted to UD
Features: annotated manually in non-UD style, automatically converted to UD
Relations: annotated manually in non-UD style, automatically converted to UD

Description

The Turkish-Kenet UD Treebank is the largest treebank of Turkish. It consists of 18,700 manually annotated sentences and 178,700 tokens, all drawn from dictionary example sentences.

This treebank is fully manually annotated and comprises 18,700 sentences and 178,700 tokens. The sentences are taken from the Turkish wordnet Kenet, whose word definitions are accompanied by example sentences from the dictionary of the Turkish Language Association. The domain is general, because the dictionary examples include sentences from novels, everyday speech, and some lines of poetry. The data is split evenly into 9,350 training sentences and 9,350 test sentences.
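UD treebanks are distributed in the CoNLL-U format, so the sentence and token counts above can be checked with a few lines of code. The following is a minimal sketch, not the tooling used by the treebank authors. It counts syntactic words (integer IDs) and skips multiword token ranges and empty nodes; whether the 178,700 figure refers to exactly this unit is an assumption. The sample sentence is invented for illustration and is not taken from the treebank, and the function name `count_sentences_and_tokens` is hypothetical.

```python
def count_sentences_and_tokens(conllu_text):
    """Return (sentence_count, word_count) for a string in CoNLL-U format."""
    sentences = 0
    words = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata, not tokens.
            continue
        token_id = line.split("\t")[0]
        # Skip multiword token ranges ("1-2") and empty nodes ("1.1").
        if not token_id.isdigit():
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        words += 1
    return sentences, words


# Invented toy sentence (NOT from the treebank), 10 tab-separated columns per row.
ROWS = [
    ["1", "Ali", "Ali", "PROPN", "_", "_", "3", "nsubj", "_", "_"],
    ["2", "kitabı", "kitap", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["3", "okudu", "oku", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
]
SAMPLE = "# text = Ali kitabı okudu.\n" + "\n".join("\t".join(r) for r in ROWS) + "\n\n"

print(count_sentences_and_tokens(SAMPLE))  # (1, 4)
```

To check the real figures, the contents of the treebank's training and test `.conllu` files would be read and passed to the same function, and the two results summed.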

Acknowledgments

We wish to thank all the contributors and Starlang Software for funding and supporting this work.

Statistics of UD Turkish Kenet

POS Tags

ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PRON, PROPN, PUNCT, SCONJ, VERB, X

Features

Aspect, Case, Definite, Degree, Mood, Number, Number[psor], NumType, Person, Person[psor], Polarity, PronType, Reflex, Tense, VerbForm, Voice

Relations

acl, advcl, advmod, amod, appos, aux, case, cc, ccomp, clf, compound, conj, csubj, dep, det, discourse, dislocated, fixed, flat, iobj, list, mark, nmod, nsubj, nummod, obj, obl, orphan, parataxis, punct, reparandum, root, vocative, xcomp

Tokenization and Word Segmentation

Morphology

Tags

Nominal Features

Degree and Polarity

Verbal Features

Pronouns, Determiners, Quantifiers

Other Features

Syntax

Auxiliary Verbs and Copula

Core Arguments, Oblique Arguments and Adjuncts

Here we consider only relations between verbs (parent) and nouns or pronouns (child).
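The restriction stated above can be sketched as a filter over CoNLL-U data: keep a dependency only when the parent's UPOS is VERB and the child's UPOS is NOUN or PRON, then tally the relation labels. This is an illustrative sketch, not the script that generates the UD statistics. The function name `verb_nominal_relations` is hypothetical, the sample sentence is invented rather than taken from the treebank, and excluding PROPN children is an assumption based on the wording "nouns or pronouns".

```python
from collections import Counter

NOMINAL_UPOS = {"NOUN", "PRON"}  # assumption: PROPN is not included


def verb_nominal_relations(conllu_text):
    """Count deprels where the parent is a VERB and the child is a NOUN or PRON."""
    counts = Counter()
    sentence = []
    # The appended "" guarantees the last sentence is flushed.
    for line in conllu_text.splitlines() + [""]:
        line = line.strip()
        if line.startswith("#"):
            continue
        if line:
            cols = line.split("\t")
            # Keep only syntactic words; skip ranges ("1-2") and empty nodes ("1.1").
            if cols[0].isdigit():
                sentence.append(cols)
            continue
        # Blank line: the sentence is complete, so parents can be looked up by ID.
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            child_upos, head, deprel = cols[3], cols[6], cols[7]
            if child_upos in NOMINAL_UPOS and upos_by_id.get(head) == "VERB":
                counts[deprel] += 1
        sentence = []
    return counts


# Invented toy sentence (NOT from the treebank).
ROWS = [
    ["1", "Ali", "Ali", "PROPN", "_", "_", "3", "nsubj", "_", "_"],
    ["2", "kitabı", "kitap", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["3", "okudu", "oku", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
]
SAMPLE = "\n".join("\t".join(r) for r in ROWS) + "\n\n"

print(dict(verb_nominal_relations(SAMPLE)))  # {'obj': 1}
```

In the toy sentence only the `obj` edge qualifies: the `nsubj` child is a PROPN and the `punct` child is a PUNCT, so both are filtered out.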

Verbs with Reflexive Core Objects

Relations Overview