UD for Old French
Sentence segmentation
Because sentence boundaries are usually not clearly marked by punctuation in the original texts, we keep only one main clause per tree, regardless of the punctuation added by modern transcriptions. For instance, the following sentence from La queste del saint Graal:
Certes, fet li rois, Keus, vos dites voir, ceste costume ai je toz jorz tenue et la tendrai tant com je porrai mes je avoie si grant joie de Lancelot et de ses cousins qui estoient venu a cort sain et haitié qu’il ne me sovenoit de la costume.
is split into four trees:
- “Certes, fet li rois, Keus, vos dites voir”
- “ceste costume ai je toz jorz tenue”
- “et la tendrai tant com je porrai”
- “mes je avoie si grant joie de Lancelot et de ses cousins qui estoient venu a cort sain et haitié qu’il ne me sovenoit de la costume.”
Tokenization and Word Segmentation
- Words are delimited by whitespace or punctuation
- Contractions such as sin = si + en are rendered as single words instead of multiword tokens, though this will probably change in the future. This is also reflected in their relation subtypes (see below).
Morphology
The following custom morphological features are used:
- Morph=VFin : finite verb
- Morph=VInf : non-finite verb
- Morph=VPar : verbal participle
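As a minimal sketch of how these values might be read in practice, the snippet below parses a feature string and looks up the `Morph` value. It assumes the features appear in the FEATS column of a CoNLL-U file in the standard `Name=Value|Name=Value` form. The helper names `parse_feats` and `verb_form` are hypothetical and not part of any UD tool.

```python
from typing import Dict, Optional

# Descriptions of the custom Morph values listed above.
MORPH_LABELS = {
    "VFin": "finite verb",
    "VInf": "non-finite verb",
    "VPar": "verbal participle",
}


def parse_feats(feats: str) -> Dict[str, str]:
    """Parse a CoNLL-U style FEATS string into a dict.

    Assumption: features are written as 'Name=Value' pairs joined by '|',
    and a lone underscore means that no features are present.
    """
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def verb_form(feats: str) -> Optional[str]:
    """Return the description of the Morph value, or None if absent or unknown."""
    return MORPH_LABELS.get(parse_feats(feats).get("Morph"))
```

For example, `verb_form("Morph=VFin")` returns the description for a finite verb, while a token with no `Morph` feature yields `None`.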
Syntax
The following relation subtypes are used:
- acl:relcl : relative clause
- advmod:obl : contracted advmod and obl (e.g. sin = si + en)
- aux:pass : passive auxiliary
- case:det : contracted case and det (e.g. del = de + le)
- cc:nc : non-coordinating conjunction (e.g. et at the beginning of a sentence)
- mark:advmod : mark and advmod (e.g. coment at the beginning of a subordinate clause)
- nsubj:advmod : contracted nsubj and advmod (e.g. jon = jo + en)
- nsubj:obj : contracted nsubj and obj (e.g. quil = qui + le)
- obj:advmod : contracted advmod and obj (e.g. sis = si + les)
- obj:advneg : contracted negation and obj (e.g. nes = ne + les)
- obj:obl : contracted obl and obj (e.g. oul = ou + le)
- obl:advmod : the double labelling accounts for the difficulty of deciding between obl and advmod relations (en and i).
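Several of the subtypes above label a single word that fuses two syntactic functions. The following sketch shows one way a script could recognise those labels and recover the example parts given in the list. The table `CONTRACTIONS` only copies the examples from this page, and the function names `split_deprel` and `contraction_parts` are hypothetical helpers, not part of any UD tool.

```python
from typing import Optional, Tuple

# Contracted labels with the example form and its parts, as listed above.
CONTRACTIONS = {
    "advmod:obl": ("sin", ("si", "en")),
    "case:det": ("del", ("de", "le")),
    "nsubj:advmod": ("jon", ("jo", "en")),
    "nsubj:obj": ("quil", ("qui", "le")),
    "obj:advmod": ("sis", ("si", "les")),
    "obj:advneg": ("nes", ("ne", "les")),
    "obj:obl": ("oul", ("ou", "le")),
}


def split_deprel(deprel: str) -> Tuple[str, Optional[str]]:
    """Split a relation such as 'case:det' into (base, subtype).

    The subtype is None when the relation has no ':' part.
    """
    base, _, subtype = deprel.partition(":")
    return base, (subtype or None)


def contraction_parts(deprel: str) -> Optional[Tuple[str, str]]:
    """Return the example parts for a contracted label, or None otherwise."""
    entry = CONTRACTIONS.get(deprel)
    return entry[1] if entry else None
```

Note that a subtype alone does not identify a contraction: `acl:relcl`, `aux:pass` and `cc:nc` are ordinary subtypes, which is why the sketch uses an explicit table instead of testing for a colon.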
Treebanks
There is 1 Old French UD treebank: