
This page pertains to UD version 2.

Enhanced Dependencies

We always intended the Universal Dependencies representation to be used in shallow natural language understanding tasks such as relation extraction or biomedical event extraction. For such tasks, one is typically interested in the relation between certain entities, e.g., the relation between two persons or whether one protein interacts with another. UD is particularly well suited for such tasks as UD trees contain many direct dependencies between content words and many of the dependency labels provide a lot of information about the type of relation between two content words. However, for some constructions, the dependency path between two content words of interest can be very long in a UD tree, which complicates determining how the content words are related. Further, some dependency types such as obl or nmod are used for many different types of arguments and modifiers, and therefore they are not very informative on their own. For these reasons, we also provide guidelines for an enhanced representation, which makes some of the implicit relations between words more explicit, and augments some of the dependency labels to facilitate the disambiguation of types of arguments and modifiers.

Enhanced UD graphs may contain some or all of the following enhancements, which are described in the sections below. If a corpus does not annotate any of the enhancements defined in the guidelines, it should always have the underscore character in the DEPS column. That is, the enhanced graph should not be just an exact copy of the basic tree for all sentences in the corpus. Otherwise it creates the impression that the user can expect some enhancements while there are actually none.

Note that the enhanced graph is not necessarily a supergraph of the basic tree, i.e., the graph is not required to contain all the basic dependency relations. For this reason, all relations of the enhanced graph (including the ones that are also present in the basic UD tree) have to be included in the DEPS column of a CoNLL-U file. See the specification of the CoNLL-U file format for details.
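As a rough illustration of how a DEPS value might be read, the following sketch splits it into head/label pairs and checks whether the enhanced edges of a token merely repeat its basic edge. The helper names parse_deps and is_copy_of_basic are hypothetical and are not part of any UD tool.

```python
def parse_deps(value):
    """Split a DEPS value such as '4:nsubj|6:nsubj' into (head, label) pairs.

    Heads are kept as strings because empty nodes have IDs like '5.1'.
    """
    if value == "_":
        return []
    pairs = []
    for item in value.split("|"):
        # Only the first colon separates head from label; the label itself
        # may contain further colons (e.g. 'obl:auf:dat').
        head, _, label = item.partition(":")
        pairs.append((head, label))
    return pairs


def is_copy_of_basic(head, deprel, deps):
    """True if the enhanced edges of one token just repeat its basic edge."""
    return parse_deps(deps) == [(head, deprel)]


print(parse_deps("4:nsubj|6:nsubj"))
print(is_copy_of_basic("2", "nsubj", "2:nsubj"))
```

If is_copy_of_basic holds for every token of every sentence, the corpus carries no enhancements and, following the rule above, DEPS should contain the underscore instead.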

Furthermore, the dependency relation labels in the enhanced graph in DEPS may contain certain extensions that are not permitted in the basic relation type in the DEPREL column. The regular expression restricting relation labels in DEPREL is simple: the label can contain only lowercase English letters and at most one colon, which separates the universal and the language-specific part of the label: ^[a-z]+(:[a-z]+)?$. In contrast, the relation label in DEPS may contain up to three colons, separating up to four sections. One of the sections (never the first one) may also contain lowercase Unicode letters and the underscore character: ^[a-z]+(:[a-z]+)?(:[\p{Ll}\p{Lm}\p{Lo}\p{M}]+(_[\p{Ll}\p{Lm}\p{Lo}\p{M}]+)*)?(:[a-z]+)?$. Only the first section, the universal relation, is mandatory. The other sections are optional but if they appear, they must appear in the order described below. The summary lists five kinds of sections, but since xsubj never combines with the two case extensions, a label has at most four. We provide a more detailed explanation of the extra sections later on this page; here is a summary:

  1. Universal dependency relation. In addition to the 37 relations defined in the basic representation, the relation can also be ref.
  2. Documented relation subtype (either language-specific or more general) from the basic representation.
  3. The string xsubj, denoting external subject relations of xcomp predicates. This extension is used only with nsubj, csubj, and their subtypes such as nsubj:pass. It does not combine with the other extensions described below because they do not apply to subjects.
  4. Case and similar information – adposition or conjunction that occurs as a case, mark or cc dependent of the node whose relation to its parent is being enhanced. Note that this is the only part where non-ASCII letters are permitted within the enhanced relation label. The word should be normalized (lowercased, no typos), i.e., in general we take its lemma. However, if the case/mark dependent is a fixed multi-word expression, the lemma of the expression is not necessarily composed of lemmas of the individual member words. For instance, the string representing the English expression “As Opposed To” is as_opposed_to. That is, the casing is normalized from “As” to “as” etc., but “opposed” is not replaced by its lemma “oppose” because the expression is fixed. Similarly, grammaticalized deverbal connectives such as “regarding” may in some languages (if required by the language-specific guidelines) still be tagged VERB, despite being attached as case, and their lemma will thus be verbal (“regard”); nevertheless, the corresponding deprel extension should be the grammaticalized form, i.e., “regarding”. Language-specific guidelines may also specify that certain synonyms (e.g., “toward” and “towards”) be mapped on the same enhanced label, despite having different lemmas. We use the underscore character (“_”) to connect member words. The same approach can also be taken when a node has multiple case markers that are not annotated as a fixed expression, e.g., out_of for “out of business”.
  5. Case information – morphological case of the node whose relation to its parent is being enhanced. The value corresponds to the value of the Case feature but is lowercased (e.g., gen instead of Gen). Unlike in morphological features, multiple comma-separated values (Case=Acc,Dat) are not allowed: case information in enhanced relations must be fully disambiguated.
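The two regular expressions above can be checked mechanically. The sketch below is one possible way to do so with the Python standard library only; because the standard re module does not support \p{...} classes, the Unicode section is tested via unicodedata.category instead. All function names are hypothetical, and the check covers only the shape of a label, not whether each section holds a meaningful value.

```python
import re
import unicodedata
from itertools import combinations

BASIC_LABEL = re.compile(r"[a-z]+(:[a-z]+)?")
PLAIN = re.compile(r"[a-z]+")


def is_plain(part):
    """Section made of lowercase English letters only."""
    return PLAIN.fullmatch(part) is not None


def is_case_part(part):
    """Section of lowercase Unicode words joined by single underscores."""
    def ok(ch):
        cat = unicodedata.category(ch)
        return cat in ("Ll", "Lm", "Lo") or cat.startswith("M")
    return all(word and all(ok(ch) for ch in word) for word in part.split("_"))


# Optional sections after the universal relation, in their required order.
SLOTS = (is_plain, is_case_part, is_plain)


def is_valid_deprel(label):
    return BASIC_LABEL.fullmatch(label) is not None


def is_valid_enhanced_label(label):
    first, *extras = label.split(":")
    if not is_plain(first) or len(extras) > len(SLOTS):
        return False
    # Try every way of assigning the extra sections to the optional slots
    # while keeping their order, mirroring the optional regex groups.
    return any(
        all(SLOTS[slot](part) for slot, part in zip(slots, extras))
        for slots in combinations(range(len(SLOTS)), len(extras))
    )


for label in ("nsubj:pass:xsubj", "obl:tmod:в_течение:gen", "obl:a:b:c:d"):
    print(label, is_valid_enhanced_label(label))
```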

Ellipsis

(See also the guidelines on ellipsis.)

In the enhanced representation, we add special empty (null) nodes in clauses in which a predicate is elided. (Although the node is termed ‘empty’ in the CoNLL-U format specification, and although it does not correspond to an overt surface token, its FORM, LEMMA, UPOS, XPOS and FEATS may be optionally filled with the assumed values; here they can be copied from the overt occurrence of the predicate.)

# visual-style 5 6 orphan color:red # visual-style 2 5 conj color:red # visual-style 5 4 cc color:red 1 I _ _ _ _ 2 nsubj _ _ 2 like _ _ _ _ 0 root _ _ 3 tea _ _ _ _ 2 obj _ _ 4 and _ _ _ _ 5 cc _ _ 5 you _ _ _ _ 2 conj _ _ 6 coffee _ _ _ _ 5 orphan _ _ 7 . _ _ _ _ 2 punct _ _
# visual-style 6 7 obj color:blue # visual-style 6 5 nsubj color:blue # visual-style 2 6 conj color:blue # visual-style 6 4 cc color:blue 1 I _ _ _ _ 2 nsubj _ _ 2 like _ _ _ _ 0 root _ _ 3 tea _ _ _ _ 2 obj _ _ 4 and _ _ _ _ 6 cc _ _ 5 you _ _ _ _ 6 nsubj _ _ 6 E5.1 _ _ _ _ 2 conj _ _ 7 coffee _ _ _ _ 6 obj _ _ 8 . _ _ _ _ 2 punct _ _
# visual-style 8 10 orphan color:red # visual-style 2 8 conj color:red # visual-style 8 7 cc color:red 1 Mary _ _ _ _ 2 nsubj _ _ 2 wants _ _ _ _ 0 root _ _ 3 to _ _ _ _ 4 mark _ _ 4 buy _ _ _ _ 2 xcomp _ _ 5 a _ _ _ _ 6 det _ _ 6 book _ _ _ _ 4 obj _ _ 7 and _ _ _ _ 8 cc _ _ 8 Jenny _ _ _ _ 2 conj _ _ 9 a _ _ _ _ 10 det _ _ 10 CD _ _ _ _ 8 orphan _ _ 11 . _ _ _ _ 2 punct _ _
# visual-style 9 8 nsubj color:blue # visual-style 10 12 obj color:blue # visual-style 9 10 xcomp color:blue # visual-style 9 7 cc color:blue # visual-style 4 1 nsubj color:blue # visual-style 10 8 nsubj color:blue 1 Mary _ _ _ _ 2 nsubj 4:nsubj _ 2 wants _ _ _ _ 0 root _ _ 3 to _ _ _ _ 4 mark _ _ 4 buy _ _ _ _ 2 xcomp _ _ 5 a _ _ _ _ 6 det _ _ 6 book _ _ _ _ 4 obj _ _ 7 and _ _ _ _ 9 cc _ _ 8 Jenny _ _ _ _ 9 nsubj 10:nsubj _ 9 E8.1 _ _ _ _ 2 conj _ _ 10 E8.2 _ _ _ _ 9 xcomp _ _ 11 a _ _ _ _ 12 det _ _ 12 CD _ _ _ _ 10 obj _ _ 13 . _ _ _ _ 2 punct _ _

Note that this is a case in which the enhanced UD graph is not a supergraph of the basic tree: the basic tree contains orphan relations, which are not present in the enhanced UD graph.
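To make the point concrete, here is a small sketch of how the first example might look in an actual CoNLL-U file, where the empty node gets a decimal ID (5.1) and only its DEPS column is filled. The encoding is our illustration and is not copied from a released treebank; columns are separated by spaces instead of tabs for readability, and read_tokens is a hypothetical helper.

```python
SENTENCE = """\
1 I I PRON _ _ 2 nsubj 2:nsubj _
2 like like VERB _ _ 0 root 0:root _
3 tea tea NOUN _ _ 2 obj 2:obj _
4 and and CCONJ _ _ 5 cc 5.1:cc _
5 you you PRON _ _ 2 conj 5.1:nsubj _
5.1 like like VERB _ _ _ _ 2:conj _
6 coffee coffee NOUN _ _ 5 orphan 5.1:obj _
7 . . PUNCT _ _ 2 punct 2:punct _
"""

COLUMNS = ("id", "form", "lemma", "upos", "xpos",
           "feats", "head", "deprel", "deps", "misc")


def read_tokens(block):
    """Parse whitespace-separated token lines into dictionaries."""
    return [
        dict(zip(COLUMNS, line.split()))
        for line in block.splitlines()
        if line and not line.startswith("#")
    ]


tokens = read_tokens(SENTENCE)

# Empty nodes are recognisable by the decimal point in their ID.
empty_nodes = [t["id"] for t in tokens if "." in t["id"]]

# Labels used in the basic tree (DEPREL) and in the enhanced graph (DEPS).
basic_labels = {t["deprel"] for t in tokens if t["deprel"] != "_"}
enhanced_labels = {
    edge.split(":", 1)[1] for t in tokens for edge in t["deps"].split("|")
}

print(empty_nodes)
print(sorted(basic_labels - enhanced_labels))
```

The second line printed lists the labels that occur only in the basic tree, which in this sentence is just orphan.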

Propagation of incoming dependencies to conjuncts

In the basic representation, the governor and dependents of a conjoined phrase are all attached to the first conjunct. This often leads to very long dependency paths between content words. The enhanced representation therefore also contains dependencies between the other conjuncts and the governor and dependents of the phrase.

Conjoined subjects and objects

When the subject is a conjoined noun phrase, each of the conjuncts is attached to the predicate.

1 Paul _ _ _ _ 5 nsubj _ _ 2 and _ _ _ _ 3 cc _ _ 3 Mary _ _ _ _ 1 conj _ _ 4 are _ _ _ _ 5 aux _ _ 5 running _ _ _ _ 0 root _ _ 6 . _ _ _ _ 5 punct _ _
# visual-style 5 3 nsubj color:blue 1 Paul _ _ _ _ 5 nsubj _ _ 2 and _ _ _ _ 3 cc _ _ 3 Mary _ _ _ _ 1 conj 5:nsubj _ 4 are _ _ _ _ 5 aux _ _ 5 running _ _ _ _ 0 root _ _ 6 . _ _ _ _ 5 punct _ _

The same is true for conjoined objects.

1 Paul _ _ _ _ 2 nsubj _ _ 2 bought _ _ _ _ 0 root _ _ 3 apples _ _ _ _ 2 obj _ _ 4 and _ _ _ _ 5 cc _ _ 5 oranges _ _ _ _ 3 conj _ _ 6 . _ _ _ _ 2 punct _ _
# visual-style 2 5 obj color:blue 1 Paul _ _ _ _ 2 nsubj _ _ 2 bought _ _ _ _ 0 root _ _ 3 apples _ _ _ _ 2 obj _ _ 4 and _ _ _ _ 5 cc _ _ 5 oranges _ _ _ _ 3 conj 2:obj _ 6 . _ _ _ _ 2 punct _ _

This leads to slightly strange dependencies in the case of collective subjects or objects:

1 Paul _ _ _ _ 5 nsubj _ _ 2 and _ _ _ _ 3 cc _ _ 3 Mary _ _ _ _ 1 conj _ _ 4 are _ _ _ _ 5 aux _ _ 5 meeting _ _ _ _ 0 root _ _ 6 . _ _ _ _ 5 punct _ _
# visual-style 5 3 nsubj color:blue 1 Paul _ _ _ _ 5 nsubj _ _ 2 and _ _ _ _ 3 cc _ _ 3 Mary _ _ _ _ 1 conj 5:nsubj _ 4 are _ _ _ _ 5 aux _ _ 5 meeting _ _ _ _ 0 root _ _ 6 . _ _ _ _ 5 punct _ _
1 Mary _ _ _ _ 3 nsubj _ _ 2 is _ _ _ _ 3 aux _ _ 3 eating _ _ _ _ 0 root _ _ 4 mac _ _ _ _ 3 obj _ _ 5 and _ _ _ _ 6 cc _ _ 6 cheese _ _ _ _ 4 conj _ _ 7 . _ _ _ _ 3 punct _ _
# visual-style 3 6 obj color:blue 1 Mary _ _ _ _ 3 nsubj _ _ 2 is _ _ _ _ 3 aux _ _ 3 eating _ _ _ _ 0 root _ _ 4 mac _ _ _ _ 3 obj _ _ 5 and _ _ _ _ 6 cc _ _ 6 cheese _ _ _ _ 4 conj 3:obj _ 7 . _ _ _ _ 3 punct _ _

However, as the distinction between distributive and collective readings is often context-dependent, we take the simplest approach and always attach all conjuncts to the predicate.

When the subject is attached to a control or raising predicate, there is a dependency between the matrix verb and each conjunct and between the embedded verb and each conjunct.

1 Mary _ _ _ _ 4 nsubj _ _ 2 and _ _ _ _ 3 cc _ _ 3 John _ _ _ _ 1 conj _ _ 4 wanted _ _ _ _ 0 root _ _ 5 to _ _ _ _ 6 mark _ _ 6 buy _ _ _ _ 4 xcomp _ _ 7 a _ _ _ _ 8 det _ _ 8 hat _ _ _ _ 6 obj _ _ 9 . _ _ _ _ 4 punct _ _
# visual-style 4 3 nsubj color:blue # visual-style 6 1 nsubj color:blue # visual-style 6 3 nsubj color:blue 1 Mary _ _ _ _ 4 nsubj 6:nsubj _ 2 and _ _ _ _ 3 cc _ _ 3 John _ _ _ _ 1 conj 4:nsubj|6:nsubj _ 4 wanted _ _ _ _ 0 root _ _ 5 to _ _ _ _ 6 mark _ _ 6 buy _ _ _ _ 4 xcomp _ _ 7 a _ _ _ _ 8 det _ _ 8 hat _ _ _ _ 6 obj _ _ 9 . _ _ _ _ 4 punct _ _

Conjoined modifiers

Each conjunct in a conjoined modifier phrase gets attached to the governor of the modifier phrase. For example, the following phrase contains a conjoined adjectival phrase that modifies a noun. In the enhanced representation, there is an additional amod relation between the noun river and the second conjunct wide.

1 a _ _ _ _ 5 det _ _ 2 long _ _ _ _ 5 amod _ _ 3 and _ _ _ _ 4 cc _ _ 4 wide _ _ _ _ 2 conj _ _ 5 river _ _ _ _ 0 root _ _
# visual-style 5 4 amod color:blue 1 a _ _ _ _ 5 det _ _ 2 long _ _ _ _ 5 amod _ _ 3 and _ _ _ _ 4 cc _ _ 4 wide _ _ _ _ 2 conj 5:amod _ 5 river _ _ _ _ 0 root _ _
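The propagation of incoming dependencies described in this section can be sketched as follows. This is a minimal illustration, not a reference implementation: tokens are plain dictionaries, the helper names tok and propagate_incoming are hypothetical, and the sketch assumes that every conj dependent is attached directly to the first conjunct and that a first conjunct which is the root is not propagated.

```python
def tok(id, form, head, deprel):
    return {"id": id, "form": form, "head": head, "deprel": deprel}


def propagate_incoming(tokens):
    """Return {token id: [(head, label), ...]} with conjunct propagation."""
    by_id = {t["id"]: t for t in tokens}
    # Start from a copy of the basic tree.
    enhanced = {t["id"]: [(t["head"], t["deprel"])] for t in tokens}
    for t in tokens:
        if t["deprel"] != "conj":
            continue
        first = by_id[t["head"]]
        # Assumption of this sketch: a root first conjunct is not propagated.
        if first["head"] != 0:
            enhanced[t["id"]].append((first["head"], first["deprel"]))
    return enhanced


sentence = [
    tok(1, "Paul", 5, "nsubj"),
    tok(2, "and", 3, "cc"),
    tok(3, "Mary", 1, "conj"),
    tok(4, "are", 5, "aux"),
    tok(5, "running", 0, "root"),
    tok(6, ".", 5, "punct"),
]
enhanced = propagate_incoming(sentence)
print(enhanced[3])  # edges of "Mary": [(1, 'conj'), (5, 'nsubj')]
```

The same function yields the additional obj and amod edges for the conjoined objects and modifiers shown above, since it copies whatever relation the first conjunct bears.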

Propagation of outgoing dependencies from conjuncts

As in the previous section, the starting point is that the basic representation attaches the governor and all dependents of a conjoined phrase to the first conjunct, which often leads to very long dependency paths between content words. For outgoing dependencies, the enhanced representation therefore also attaches the shared dependents of the phrase to the other conjuncts.

Conjoined verbs and verb phrases

When two verbs share their objects (or other complements), the subject and the object of the conjoined verbs are attached to every conjunct.

1 The _ _ _ _ 2 det _ _ 2 store _ _ _ _ 3 nsubj _ _ 3 buys _ _ _ _ 0 root _ _ 4 and _ _ _ _ 5 cc _ _ 5 sells _ _ _ _ 3 conj _ _ 6 cameras _ _ _ _ 3 obj _ _ 7 . _ _ _ _ 3 punct _ _
# visual-style 5 2 nsubj color:blue # visual-style 5 6 obj color:blue 1 The _ _ _ _ 2 det _ _ 2 store _ _ _ _ 3 nsubj 5:nsubj _ 3 buys _ _ _ _ 0 root _ _ 4 and _ _ _ _ 5 cc _ _ 5 sells _ _ _ _ 3 conj _ _ 6 cameras _ _ _ _ 3 obj 5:obj _ 7 . _ _ _ _ 3 punct _ _

However, if the complements of the second verb are not shared, only the shared dependents are attached to every conjunct.

1 She _ _ _ _ 3 nsubj _ _ 2 was _ _ _ _ 3 aux _ _ 3 reading _ _ _ _ 0 root _ _ 4 or _ _ _ _ 5 cc _ _ 5 watching _ _ _ _ 3 conj _ _ 6 a _ _ _ _ 7 det _ _ 7 movie _ _ _ _ 5 obj _ _ 8 . _ _ _ _ 3 punct _ _
# visual-style 5 1 nsubj color:blue # visual-style 5 2 aux color:blue 1 She _ _ _ _ 3 nsubj 5:nsubj _ 2 was _ _ _ _ 3 aux 5:aux _ 3 reading _ _ _ _ 0 root _ _ 4 or _ _ _ _ 5 cc _ _ 5 watching _ _ _ _ 3 conj _ _ 6 a _ _ _ _ 7 det _ _ 7 movie _ _ _ _ 5 obj _ _ 8 . _ _ _ _ 3 punct _ _

Similarly, the enhanced representation can also distinguish private dependents of the first verb. Note, however, that in this case the distinction cannot be inferred automatically from the basic representation.

1 She _ _ _ _ 3 nsubj _ _ 2 was _ _ _ _ 3 aux _ _ 3 watching _ _ _ _ 0 root _ _ 4 a _ _ _ _ 5 det _ _ 5 movie _ _ _ _ 3 obj _ _ 6 or _ _ _ _ 7 cc _ _ 7 reading _ _ _ _ 3 conj _ _ 8 . _ _ _ _ 3 punct _ _
# visual-style 7 1 nsubj color:blue # visual-style 7 2 aux color:blue 1 She _ _ _ _ 3 nsubj 7:nsubj _ 2 was _ _ _ _ 3 aux 7:aux _ 3 watching _ _ _ _ 0 root _ _ 4 a _ _ _ _ 5 det _ _ 5 movie _ _ _ _ 3 obj _ _ 6 or _ _ _ _ 7 cc _ _ 7 reading _ _ _ _ 3 conj _ _ 8 . _ _ _ _ 3 punct _ _
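Because shared and private dependents of the first verb look the same in the basic tree, any automatic procedure has to rely on a heuristic. The sketch below shows one such heuristic, which is our assumption and not part of the guidelines: subjects and auxiliaries of the first conjunct are shared with a later conjunct unless that conjunct has its own, while objects are never propagated. The names tok, SHAREABLE and propagate_shared_dependents are hypothetical.

```python
def tok(id, form, head, deprel):
    return {"id": id, "form": form, "head": head, "deprel": deprel}


# Relations that this heuristic treats as shareable (an assumption).
SHAREABLE = ("nsubj", "aux")


def propagate_shared_dependents(tokens):
    """Return extra (governor id, label, dependent id) edges for conjuncts."""
    children = {}
    for t in tokens:
        children.setdefault(t["head"], []).append(t)
    extra = []
    for conj in tokens:
        if conj["deprel"] != "conj":
            continue
        own = {c["deprel"] for c in children.get(conj["id"], [])}
        for dep in children.get(conj["head"], []):
            # Share only if the later conjunct lacks a dependent of that type.
            if dep["deprel"] in SHAREABLE and dep["deprel"] not in own:
                extra.append((conj["id"], dep["deprel"], dep["id"]))
    return extra


sentence = [
    tok(1, "She", 3, "nsubj"),
    tok(2, "was", 3, "aux"),
    tok(3, "reading", 0, "root"),
    tok(4, "or", 5, "cc"),
    tok(5, "watching", 3, "conj"),
    tok(6, "a", 7, "det"),
    tok(7, "movie", 5, "obj"),
    tok(8, ".", 3, "punct"),
]
print(propagate_shared_dependents(sentence))
```

A heuristic like this would miss the shared object in "The store buys and sells cameras", which is exactly why such cases need manual annotation or richer cues.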

Controlled/raised subjects

The basic trees lack a subject dependency between a controlled verb and its controller or between an embedded verb and its raised subject. In the enhanced graph, there is an additional dependency between the embedded verb and the subject of the matrix clause. This dependency can be recognized by the extension (subtype) :xsubj.

Basic and enhanced representations (each basic tree is followed by its enhanced graph):
1 Mary _ _ _ _ 2 nsubj _ _ 2 wants _ _ _ _ 0 root _ _ 3 to _ _ _ _ 4 mark _ _ 4 buy _ _ _ _ 2 xcomp _ _ 5 a _ _ _ _ 6 det _ _ 6 book _ _ _ _ 4 obj _ _ 7 . _ _ _ _ 2 punct _ _
# visual-style 4 1 nsubj:xsubj color:blue 1 Mary _ _ _ _ 2 nsubj 4:nsubj:xsubj _ 2 wants _ _ _ _ 0 root _ _ 3 to _ _ _ _ 4 mark _ _ 4 buy _ _ _ _ 2 xcomp _ _ 5 a _ _ _ _ 6 det _ _ 6 book _ _ _ _ 4 obj _ _ 7 . _ _ _ _ 2 punct _ _
1 She _ _ _ _ 2 nsubj _ _ 2 seems _ _ _ _ 0 root _ _ 3 to _ _ _ _ 5 mark _ _ 4 be _ _ _ _ 5 aux _ _ 5 reading _ _ _ _ 2 xcomp _ _ 6 a _ _ _ _ 7 det _ _ 7 book _ _ _ _ 5 obj _ _ 8 . _ _ _ _ 2 punct _ _
# visual-style 5 1 nsubj:xsubj color:blue 1 She _ _ _ _ 2 nsubj 5:nsubj:xsubj _ 2 seems _ _ _ _ 0 root _ _ 3 to _ _ _ _ 5 mark _ _ 4 be _ _ _ _ 5 aux _ _ 5 reading _ _ _ _ 2 xcomp _ _ 6 a _ _ _ _ 7 det _ _ 7 book _ _ _ _ 5 obj _ _ 8 . _ _ _ _ 2 punct _ _
1 Mary _ _ _ _ 2 nsubj _ _ 2 made _ _ _ _ 0 root _ _ 3 me _ _ _ _ 2 obj _ _ 4 buy _ _ _ _ 2 xcomp _ _ 5 the _ _ _ _ 6 det _ _ 6 house _ _ _ _ 4 obj _ _ 7 . _ _ _ _ 2 punct _ _
# visual-style 4 3 nsubj:xsubj color:blue 1 Mary _ _ _ _ 2 nsubj _ _ 2 made _ _ _ _ 0 root _ _ 3 me _ _ _ _ 2 obj 4:nsubj:xsubj _ 4 buy _ _ _ _ 2 xcomp _ _ 5 the _ _ _ _ 6 det _ _ 6 house _ _ _ _ 4 obj _ _ 7 . _ _ _ _ 2 punct _ _
1 Mary _ _ _ _ 2 nsubj _ _ 2 wants _ _ _ _ 0 root _ _ 3 me _ _ _ _ 2 obj _ _ 4 to _ _ _ _ 6 mark _ _ 5 be _ _ _ _ 6 aux:pass _ _ 6 promoted _ _ _ _ 2 xcomp _ _ 7 . _ _ _ _ 2 punct _ _
# visual-style 6 3 nsubj:pass:xsubj color:blue 1 Mary _ _ _ _ 2 nsubj _ _ 2 wants _ _ _ _ 0 root _ _ 3 me _ _ _ _ 2 obj 6:nsubj:pass:xsubj _ 4 to _ _ _ _ 6 mark _ _ 5 be _ _ _ _ 6 aux:pass _ _ 6 promoted _ _ _ _ 2 xcomp _ _ 7 . _ _ _ _ 2 punct _ _
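The pattern in the examples above can be approximated with a simple rule, sketched below under a strong assumption: the controller is the object of the matrix verb if there is one, and the matrix subject otherwise. This is a heuristic for illustration only (it would be wrong for subject-control verbs that also take an object), and the names tok and add_xsubj are hypothetical.

```python
def tok(id, form, head, deprel):
    return {"id": id, "form": form, "head": head, "deprel": deprel}


def add_xsubj(tokens):
    """Return extra (governor id, label, dependent id) edges for xcomp predicates."""
    children = {}
    for t in tokens:
        children.setdefault(t["head"], []).append(t)
    edges = []
    for pred in tokens:
        if pred["deprel"] != "xcomp":
            continue
        siblings = children.get(pred["head"], [])
        objects = [s for s in siblings if s["deprel"] == "obj"]
        subjects = [s for s in siblings if s["deprel"].startswith("nsubj")]
        # Heuristic assumption: the matrix object controls if there is one,
        # otherwise the matrix subject does.
        candidates = objects or subjects
        if not candidates:
            continue
        # A passive embedded predicate gets a passive subject relation.
        passive = any(c["deprel"] == "aux:pass" for c in children.get(pred["id"], []))
        label = "nsubj:pass:xsubj" if passive else "nsubj:xsubj"
        edges.append((pred["id"], label, candidates[0]["id"]))
    return edges


wants_to_buy = [
    tok(1, "Mary", 2, "nsubj"), tok(2, "wants", 0, "root"),
    tok(3, "to", 4, "mark"), tok(4, "buy", 2, "xcomp"),
    tok(5, "a", 6, "det"), tok(6, "book", 4, "obj"), tok(7, ".", 2, "punct"),
]
wants_me_promoted = [
    tok(1, "Mary", 2, "nsubj"), tok(2, "wants", 0, "root"),
    tok(3, "me", 2, "obj"), tok(4, "to", 6, "mark"),
    tok(5, "be", 6, "aux:pass"), tok(6, "promoted", 2, "xcomp"),
    tok(7, ".", 2, "punct"),
]
print(add_xsubj(wants_to_buy))
print(add_xsubj(wants_me_promoted))
```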

Relative clauses

In basic trees, relative pronouns are attached to the main predicate of the relative clause (typically with an nsubj or obj relation). In the corresponding enhanced graphs, the relative pronoun is attached to its antecedent with the special ref relation and the antecedent is attached as a dependent of the node that is the parent of the relative pronoun in the basic tree. Typically this parent is the main predicate of the relative clause, but it is not always so (see examples below).

In the case where there is no explicit relative pronoun, there is no ref relation in the enhanced graph but the antecedent is still annotated as a dependent of a node in the relative clause, depending on the role it plays in the relative clause.

Note that such graphs contain a cycle.

# visual-style 4 3 nsubj color:red 1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 boy boy NOUN _ Gender=Masc|Number=Sing 0 root _ _ 3 who who PRON _ PronType=Rel 4 nsubj _ _ 4 lived live VERB _ Mood=Ind|Tense=Past|VerbForm=Fin 2 acl:relcl _ _
# visual-style 4 2 nsubj color:blue # visual-style 2 3 ref color:blue 1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 boy boy NOUN _ Gender=Masc|Number=Sing 0 root 4:nsubj _ 3 who who PRON _ PronType=Rel 2 ref _ _ 4 lived live VERB _ Mood=Ind|Tense=Past|VerbForm=Fin 2 acl:relcl _ _
# visual-style 5 3 obj color:red 1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 book book NOUN _ Gender=Neut|Number=Sing 0 root _ _ 3 that that PRON _ PronType=Rel 5 obj _ _ 4 I I PRON _ Number=Sing|Person=1|PronType=Prs 5 nsubj _ _ 5 read read VERB _ Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _
# visual-style 5 2 obj color:blue # visual-style 2 3 ref color:blue 1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 book book NOUN _ Gender=Neut|Number=Sing 0 root 5:obj _ 3 that that PRON _ PronType=Rel 2 ref _ _ 4 I I PRON _ Number=Sing|Person=1|PronType=Prs 5 nsubj _ _ 5 read read VERB _ Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _
1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 book book NOUN _ Gender=Neut|Number=Sing 0 root _ _ 3 I I PRON _ Number=Sing|Person=1|PronType=Prs 4 nsubj _ _ 4 read read VERB _ Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _
# visual-style 4 2 obj color:blue 1 the the DET _ Definite=Def|PronType=Art 2 det _ _ 2 book book NOUN _ Gender=Neut|Number=Sing 0 root 4:obj _ 3 I I PRON _ Number=Sing|Person=1|PronType=Prs 4 nsubj _ _ 4 read read VERB _ Mood=Ind|Number=Sing|Person=1|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _
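The simple pattern in the examples above, where the relative pronoun depends directly on the predicate of the relative clause, can be sketched as follows. The helper names tok and enhance_relative are hypothetical, the rel flag stands in for the PronType=Rel feature, and the sketch deliberately leaves other configurations (for example a relativizer embedded more deeply in the clause) untouched.

```python
def tok(id, form, head, deprel, rel=False):
    return {"id": id, "form": form, "head": head, "deprel": deprel, "rel": rel}


def enhance_relative(tokens):
    """Return {token id: [(head, label), ...]} with ref edges added."""
    by_id = {t["id"]: t for t in tokens}
    # Start from a copy of the basic tree.
    enhanced = {t["id"]: [(t["head"], t["deprel"])] for t in tokens}
    for t in tokens:
        if not t["rel"]:
            continue
        pred = by_id.get(t["head"])
        # Only the simple case is handled: the pronoun's parent is the
        # predicate of the relative clause.
        if pred is None or pred["deprel"] != "acl:relcl":
            continue
        antecedent = pred["head"]
        # The antecedent takes over the pronoun's role in the clause ...
        enhanced[antecedent].append((pred["id"], t["deprel"]))
        # ... and the pronoun is attached to the antecedent as ref.
        enhanced[t["id"]] = [(antecedent, "ref")]
    return enhanced


sentence = [
    tok(1, "the", 2, "det"),
    tok(2, "boy", 0, "root"),
    tok(3, "who", 4, "nsubj", rel=True),
    tok(4, "lived", 2, "acl:relcl"),
]
enhanced = enhance_relative(sentence)
print(enhanced[2])  # edges of "boy"
print(enhanced[3])  # edges of "who"
```

In the result, "boy" governs "lived" via acl:relcl and "lived" governs "boy" via nsubj, which is the cycle mentioned above.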

Adverbial relativizers receive the same treatment.

# visual-style 5 3 advmod color:red 1 the the DET DT Definite=Def|PronType=Art 2 det _ _ 2 episode episode NOUN NN Number=Sing 0 root _ _ 3 where where ADV WRB PronType=Rel 5 advmod _ _ 4 Monica Monica PROPN NNP Number=Sing 5 nsubj _ _ 5 sings sing VERB VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _
# visual-style 2 3 ref color:blue # visual-style 5 2 obl color:blue 1 the the DET DT Definite=Def|PronType=Art 2 det _ _ 2 episode episode NOUN NN Number=Sing 0 root 5:obl _ 3 where where ADV WRB PronType=Rel 2 ref _ _ 4 Monica Monica PROPN NNP Number=Sing 5 nsubj _ _ 5 sings sing VERB VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 2 acl:relcl _ _

The enhanced relations include deep syntactic relations. Therefore, in case-marking languages, the enhanced dependencies may link a verb to dependents that are not in the morphological case that surface syntax would require. In the following Czech example, the relative modifier phrase v němž “in which” is obligatorily in the locative case form (Case=Loc). If it were a main clause, the referent dům “house” would have to be in the locative too: v domě “in house”. However, here it is in the nominative (Case=Nom), and the enhanced dependency obl going to a nominative dependent is something we would not expect to see, given the morpho-syntactic rules of the language.

# visual-style 5 4 obl color:red 1 dům house NOUN _ Animacy=Inan|Case=Nom|Gender=Masc|Number=Sing 0 root _ _ 2 , , PUNCT _ _ 5 punct _ _ 3 v in ADP _ _ 4 case _ _ 4 němž that PRON _ Case=Loc|Gender=Masc|Number=Sing|PronType=Rel 5 obl _ _ 5 žijeme live VERB _ Aspect=Imp|Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin|Voice=Act 1 acl:relcl _ _
# visual-style 5 1 obl color:blue # visual-style 1 4 ref color:blue 1 dům house NOUN _ Animacy=Inan|Case=Nom|Gender=Masc|Number=Sing 0 root 5:obl _ 2 , , PUNCT _ _ 5 punct _ _ 3 v in ADP _ _ 4 case _ _ 4 němž that PRON _ Case=Loc|Gender=Masc|Number=Sing|PronType=Rel 1 ref _ _ 5 žijeme live VERB _ Aspect=Imp|Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin|Voice=Act 1 acl:relcl _ _

The relative element does not always depend directly on the predicate of the relative clause. It may be embedded deeper as in the following example.

# visual-style 5 4 det color:red 1 muž man NOUN _ Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing 0 root _ _ 2 , , PUNCT _ _ 6 punct _ _ 3 v in ADP _ _ 5 case _ _ 4 jehož whose DET _ Gender[psor]=Masc|Number[psor]=Sing|Poss=Yes|PronType=Rel 5 det _ _ 5 domě house NOUN _ Animacy=Inan|Case=Loc|Gender=Masc|Number=Sing 6 obl _ _ 6 žijeme live VERB _ Aspect=Imp|Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin|Voice=Act 1 acl:relcl _ _
# visual-style 5 1 nmod color:blue # visual-style 1 4 ref color:blue 1 muž man NOUN _ Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing 0 root 5:nmod _ 2 , , PUNCT _ _ 6 punct _ _ 3 v in ADP _ _ 5 case _ _ 4 jehož whose DET _ Gender[psor]=Masc|Number[psor]=Sing|Poss=Yes|PronType=Rel 1 ref _ _ 5 domě house NOUN _ Animacy=Inan|Case=Loc|Gender=Masc|Number=Sing 6 obl _ _ 6 žijeme live VERB _ Aspect=Imp|Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin|Voice=Act 1 acl:relcl _ _

If the relative clause has a nominal predicate, the relative pronoun may occupy the head position within the clause. Unlike in most relative clauses, here the parent of the relative pronoun in the basic tree is not inside the relative clause, and its antecedent will not have an additional enhanced relation attaching it to a (non-existent) parent in the relative clause. Instead, we add an nsubj relation from the antecedent to the nsubj of the relative clause (and remove the corresponding nsubj relation between the relative pronoun and the subject). The acl:relcl should remain the same as in basic dependencies.

# visual-style 5 6 nsubj color:red 1 He he PRON _ Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 2 nsubj _ _ 2 became become VERB _ Mood=Ind|Tense=Past|VerbForm=Fin 0 root _ _ 3 chairman chairman NOUN _ Number=Sing 2 xcomp _ SpaceAfter=No 4 , , PUNCT _ _ 5 punct _ _ 5 which which PRON _ PronType=Rel 3 acl:relcl _ _ 6 he he PRON _ Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 5 nsubj _ _ 7 still still ADV _ _ 5 advmod _ _ 8 is be AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 5 cop _ SpaceAfter=No 9 . . PUNCT _ _ 2 punct _ _
# visual-style 3 6 nsubj color:blue # visual-style 3 5 ref color:blue 1 He he PRON _ Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 2 nsubj _ _ 2 became become VERB _ Mood=Ind|Tense=Past|VerbForm=Fin 0 root _ _ 3 chairman chairman NOUN _ Number=Sing 2 xcomp _ SpaceAfter=No 4 , , PUNCT _ _ 5 punct _ _ 5 which which PRON _ PronType=Rel 3 acl:relcl 3:ref _ 6 he he PRON _ Case=Nom|Gender=Masc|Number=Sing|Person=3|PronType=Prs 3 nsubj _ _ 7 still still ADV _ _ 5 advmod _ _ 8 is be AUX _ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 5 cop _ SpaceAfter=No 9 . . PUNCT _ _ 2 punct _ _

Case Information

Adding prepositions (or case information) to the relation names of non-core dependents often makes it possible to disambiguate their semantic roles. We therefore augment certain relation labels with the case information of the modifier. The augmented relations are nmod, acl, obl and advcl; if it makes sense in the language, some core relations may also be augmented: obj, iobj, ccomp. Case information may be represented by the lemma of an adposition attached via a case relation. For clauses, the corresponding information may be represented by the lemma of a mark dependent instead. Case information may also be represented by the value of the morphological feature Case. In some languages, there is both the adposition and the morphological case, and their combination must be reflected in the enhanced relation.

In a similar manner, enhanced UD graphs also contain conj relations that are augmented with their coordinating conjunction. This makes the type of coordination between two phrases more explicit, which is particularly useful in phrases with multiple coordinating conjunctions.

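The formal rules given in the summary at the beginning of this page (the sections on adposition or conjunction information and on morphological case) apply here as well. The following examples illustrate them: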

# visual-style 2 5 nmod color:red 1 the _ _ _ _ 2 det _ _ 2 house _ _ _ _ 0 root _ _ 3 on _ _ _ _ 5 case _ _ 4 the _ _ _ _ 5 det _ _ 5 hill _ _ _ _ 2 nmod _ _
# visual-style 2 5 nmod:on color:blue 1 the _ _ _ _ 2 det _ _ 2 house _ _ _ _ 0 root _ _ 3 on _ _ _ _ 5 case _ _ 4 the _ _ _ _ 5 det _ _ 5 hill _ _ _ _ 2 nmod:on _ _
# visual-style 2 5 obl color:red # visual-style 2 7 advcl color:red 1 He _ _ _ _ 2 nsubj _ _ 2 went _ _ _ _ 0 root _ _ 3 to _ _ _ _ 5 case _ _ 4 the _ _ _ _ 5 det _ _ 5 dinner _ _ _ _ 2 obl _ _ 6 after _ _ _ _ 7 mark _ _ 7 leaving _ _ _ _ 2 advcl _ _ 8 work _ _ _ _ 7 obj _ _ 9 . _ _ _ _ 2 punct _ _
# visual-style 2 5 obl:to color:blue # visual-style 2 7 advcl:after color:blue 1 He _ _ _ _ 2 nsubj _ _ 2 went _ _ _ _ 0 root _ _ 3 to _ _ _ _ 5 case _ _ 4 the _ _ _ _ 5 det _ _ 5 dinner _ _ _ _ 2 obl:to _ _ 6 after _ _ _ _ 7 mark _ _ 7 leaving _ _ _ _ 2 advcl:after _ _ 8 work _ _ _ _ 7 obj _ _ 9 . _ _ _ _ 2 punct _ _
# visual-style 2 4 nmod color:red # text = the destruction of the city 1 die the DET _ Case=Nom 2 det _ _ 2 Zerstörung destruction NOUN _ Case=Nom 0 root _ _ 3 der the DET _ Case=Gen 4 det _ _ 4 Stadt city NOUN _ Case=Gen 2 nmod _ _
# visual-style 2 4 nmod:gen color:blue # text = the destruction of the city 1 die the DET _ Case=Nom 2 det _ _ 2 Zerstörung destruction NOUN _ Case=Nom 0 root _ _ 3 der the DET _ Case=Gen 4 det _ _ 4 Stadt city NOUN _ Case=Gen 2 nmod:gen _ _
# visual-style 2 5 obl color:red # text = He sits on the floor 1 Er he PRON _ Case=Nom 2 nsubj _ _ 2 sitzt sits VERB _ _ 0 root _ _ 3 auf on ADP _ _ 5 case _ _ 4 dem the DET _ Case=Dat 5 det _ _ 5 Boden floor NOUN _ Case=Dat 2 obl _ SpaceAfter=No 6 . . PUNCT _ _ 2 punct _ _
# visual-style 2 5 obl:auf:dat color:blue # text = He sits on the floor 1 Er he PRON _ Case=Nom 2 nsubj _ _ 2 sitzt sits VERB _ _ 0 root _ _ 3 auf on ADP _ _ 5 case _ _ 4 dem the DET _ Case=Dat 5 det _ _ 5 Boden floor NOUN _ Case=Dat 2 obl:auf:dat _ SpaceAfter=No 6 . . PUNCT _ _ 2 punct _ _
# visual-style 2 6 obl color:red # text = He sits down on the floor 1 Er he PRON _ Case=Nom 2 nsubj _ _ 2 setzt sets VERB _ _ 0 root _ _ 3 sich himself PRON _ Case=Acc 2 expl:pv _ _ 4 auf on ADP _ _ 6 case _ _ 5 den the DET _ Case=Acc 6 det _ _ 6 Boden floor NOUN _ Case=Acc 2 obl _ SpaceAfter=No 7 . . PUNCT _ _ 2 punct _ _
# visual-style 2 6 obl:auf:acc color:blue # text = He sits down on the floor 1 Er he PRON _ Case=Nom 2 nsubj _ _ 2 setzt sets VERB _ _ 0 root _ _ 3 sich himself PRON _ Case=Acc 2 expl:pv _ _ 4 auf on ADP _ _ 6 case _ _ 5 den the DET _ Case=Acc 6 det _ _ 6 Boden floor NOUN _ Case=Acc 2 obl:auf:acc _ SpaceAfter=No 7 . . PUNCT _ _ 2 punct _ _
# visual-style 5 4 obl:tmod color:red # visual-style 6 7 nmod color:red # text = For a long time he studied the Maya language. 1 В In ADP _ _ 4 case _ _ 2 течение duration NOUN _ Case=Loc 1 fixed _ _ 3 долгого long ADJ _ Case=Gen 4 amod _ _ 4 времени time NOUN _ Case=Gen 5 obl:tmod _ _ 5 изучал studied VERB _ _ 0 root _ _ 6 язык language NOUN _ Case=Acc 5 obj _ _ 7 майя Maya PROPN _ Case=Gen 6 nmod _ SpaceAfter=No 8 . . PUNCT _ _ 5 punct _ _
# visual-style 5 4 obl:tmod:в_течение:gen color:blue # visual-style 6 7 nmod:gen color:blue # text = For a long time he studied the Maya language. 1 В In ADP _ _ 4 case _ _ 2 течение duration NOUN _ Case=Loc 1 fixed _ _ 3 долгого long ADJ _ Case=Gen 4 amod _ _ 4 времени time NOUN _ Case=Gen 5 obl:tmod:в_течение:gen _ _ 5 изучал studied VERB _ _ 0 root _ _ 6 язык language NOUN _ Case=Acc 5 obj _ _ 7 майя Maya PROPN _ Case=Gen 6 nmod:gen _ SpaceAfter=No 8 . . PUNCT _ _ 5 punct _ _
# visual-style 3 7 obl color:red # visual-style 4 6 conj color:red # text = Lidé se rozutekli před a během útoku. 1 Lidé People NOUN _ Case=Nom 3 nsubj _ _ 2 se themselves PRON _ Case=Acc 3 expl:pv _ _ 3 rozutekli scattered VERB _ _ 0 root _ _ 4 před before ADP _ Case=Ins 7 case _ _ 5 a and CCONJ _ _ 6 cc _ _ 6 během during ADP _ Case=Gen 4 conj _ _ 7 útoku attack NOUN _ Case=Gen 3 obl _ SpaceAfter=No 8 . . PUNCT _ _ 3 punct _ _
# visual-style 3 7 obl:během:gen color:blue # visual-style 3 7 obl:před:ins color:blue # visual-style 4 6 conj:a color:blue # text = Lidé se rozutekli před a během útoku. 1 Lidé People NOUN _ Case=Nom 3 nsubj _ _ 2 se themselves PRON _ Case=Acc 3 expl:pv _ _ 3 rozutekli scattered VERB _ _ 0 root _ _ 4 před before ADP _ Case=Ins 7 case _ _ 5 a and CCONJ _ _ 6 cc _ _ 6 během during ADP _ Case=Gen 4 conj:a _ _ 7 útoku attack NOUN _ Case=Gen 3 obl:během:gen 3:obl:před:ins SpaceAfter=No 8 . . PUNCT _ _ 3 punct _ _
# visual-style 1 3 conj color:red # visual-style 1 6 conj color:red 1 apples _ _ _ _ 0 root _ _ 2 and _ _ _ _ 3 cc _ _ 3 bananas _ _ _ _ 1 conj _ SpaceAfter=No 4 , _ _ _ _ 6 punct _ _ 5 or _ _ _ _ 6 cc _ _ 6 oranges _ _ _ _ 1 conj _ _
# visual-style 1 3 conj:and color:blue # visual-style 1 6 conj:or color:blue 1 apples _ _ _ _ 0 root _ _ 2 and _ _ _ _ 3 cc _ _ 3 bananas _ _ _ _ 1 conj:and _ SpaceAfter=No 4 , _ _ _ _ 6 punct _ _ 5 or _ _ _ _ 6 cc _ _ 6 oranges _ _ _ _ 1 conj:or _ _
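The label augmentation illustrated by the examples above can be sketched as a small function. This is only an illustration under simplifying assumptions: tokens are plain dictionaries, the names tok, AUGMENTED and augment_label are hypothetical, the morphological case is passed in directly instead of being read from FEATS, only the four relations nmod, acl, obl and advcl (plus conj) are handled, and coordinated adpositions such as the Czech "před a během" case are not covered.

```python
def tok(id, form, lemma, head, deprel, case=None):
    return {"id": id, "form": form, "lemma": lemma,
            "head": head, "deprel": deprel, "case": case}


AUGMENTED = {"nmod", "acl", "obl", "advcl"}


def augment_label(token, tokens):
    """Build the enhanced label of `token` from its case/mark/cc dependents."""
    universal = token["deprel"].split(":")[0]
    children = [t for t in tokens if t["head"] == token["id"]]
    parts = [token["deprel"]]
    if universal in AUGMENTED:
        words = []
        for marker in children:
            if marker["deprel"] not in ("case", "mark"):
                continue
            fixed = [t for t in tokens
                     if t["head"] == marker["id"] and t["deprel"] == "fixed"]
            if fixed:
                # Fixed expression: normalized word forms, not lemmas.
                words.extend(w["form"].lower() for w in [marker] + fixed)
            else:
                words.append(marker["lemma"].lower())
        if words:
            parts.append("_".join(words))
        if token["case"]:
            parts.append(token["case"].lower())
    elif universal == "conj":
        conjunctions = [c for c in children if c["deprel"] == "cc"]
        if conjunctions:
            parts.append(conjunctions[0]["lemma"].lower())
    return ":".join(parts)


german = [
    tok(1, "Er", "er", 2, "nsubj", "Nom"),
    tok(2, "sitzt", "sitzen", 0, "root"),
    tok(3, "auf", "auf", 5, "case"),
    tok(4, "dem", "der", 5, "det", "Dat"),
    tok(5, "Boden", "Boden", 2, "obl", "Dat"),
]
russian = [
    tok(1, "В", "в", 4, "case"),
    tok(2, "течение", "течение", 1, "fixed"),
    tok(3, "долгого", "долгий", 4, "amod", "Gen"),
    tok(4, "времени", "время", 5, "obl:tmod", "Gen"),
    tok(5, "изучал", "изучать", 0, "root"),
]
fruit = [
    tok(1, "apples", "apple", 0, "root"),
    tok(2, "and", "and", 3, "cc"),
    tok(3, "bananas", "banana", 1, "conj"),
]
print(augment_label(german[4], german))
print(augment_label(russian[3], russian))
print(augment_label(fruit[2], fruit))
```

Passing the case value explicitly keeps the sketch short; it also reflects the rule that the case in an enhanced label must be a single, fully disambiguated value.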

Additional enhancements

Some postprocessing steps, such as demoting light nouns that behave like quantificational determiners (as described, for example, in Schuster and Manning (2016)), can improve the usability of the dependency graphs for downstream applications. However, as most of these additions are highly language-specific, we do not provide any universal guidelines for such a representation. Anything beyond the above additions is not part of the UD standard and should not be added to the officially released treebanks.


DZ: Here are some additional thoughts on things that are not part of the officially approved guidelines but I think that they should be considered for addition in the future (based on experience with the treebanks that already contain some enhanced annotation).