Quick introduction (test)
This online repository contains dependency annotation documentation and visualizations, built using a combination of Jekyll, GitHub pages and embedded brat visualizations. Here’s a minimal example:
An adjectival complement of a verb is an adjectival phrase which functions as the complement (see ADJ).
She looks very beautiful
acomp(looks, beautiful)
The text is Markdown (with optional inline HTML) and the data for the visualizations is represented in either the Stanford Dependency or CoNLL-X format. For example, the above visualization is generated from this input:
An adjectival complement of a verb is an adjectival phrase which functions as the complement (see [ADJ]()).
~~~ sdparse
She looks very beautiful
acomp(looks, beautiful)
~~~
See the links below for more information.
How to contribute
See here for instructions on how to contribute to this online documentation. See below for details on how the visualized examples are created.
More information:
The following sources of documentation provide further details:
Further details are provided below.
Automatic parse visualization
Simple examples
A single tree in the Stanford Dependency format can be embedded using the following syntax:
~~~ sdparse
Dogs run
nsubj(run, Dogs)
~~~
which results in this embedded visualization:
Dogs run
nsubj(run, Dogs)
The CoNLL-X format is also supported. For example,
~~~ conllx
1 Dogs dog _ NNS _ 2 nsubj
2 run run _ VBP _ 0 ROOT
~~~
gives
1 Dogs dog _ NNS _ 2 nsubj
2 run run _ VBP _ 0 ROOT
Similarly, the new CoNLL-U format is now supported as well:
~~~ conllu
# I wrote the letter with a quill.
1 Я ja PRON _ Case=Nom|Number=Sing|Person=1|PronType=Prs 2 nsubj _ I
2 написал napisat' VERB _ Gender=Masc|Number=Sing|VerbForm=Part|Voice=Act 0 root _ wrote
3 письмо pis'mo NOUN _ Case=Acc|Gender=Neut|Number=Sing 2 obj _ the-letter
4 пером pero NOUN _ Case=Ins|Gender=Neut|Number=Sing 2 nmod _ with-a-quill
~~~
# I wrote the letter with a quill.
1 Я ja PRON _ Case=Nom|Number=Sing|Person=1|PronType=Prs 2 nsubj _ I
2 написал napisat' VERB _ Gender=Masc|Number=Sing|VerbForm=Part|Voice=Act 0 root _ wrote
3 письмо pis'mo NOUN _ Case=Acc|Gender=Neut|Number=Sing 2 obj _ the-letter
4 пером pero NOUN _ Case=Ins|Gender=Neut|Number=Sing 2 nmod _ with-a-quill
The comment lines in the CoNLL-U format can be used to change styles, e.g. to highlight a node / arc or to distinguish tree that does not adhere to the UD standard:
~~~ conllu
# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1 Na na ADP _ _ 4 AuxP _ at
2 Hlavním hlavní ADJ _ _ 3 Atr _ Main
3 nádraží nádraží NOUN _ _ 1 Adv _ Station
4 došlo dojít VERB _ _ 0 Pred _ there-was
5 k k ADP _ _ 4 AuxP _ to
6 nehodě nehoda NOUN _ _ 5 Obj _ accident
7 . . PUNCT _ _ 0 AuxK _ .
~~~
# This is not UD, it is Prague Dependency Treebank, and we want to clearly distinguish it from the UD examples.
# visual-style nodes yellow
# visual-style arcs blue
1 Na na ADP _ _ 4 AuxP _ at
2 Hlavním hlavní ADJ _ _ 3 Atr _ Main
3 nádraží nádraží NOUN _ _ 1 Adv _ Station
4 došlo dojít VERB _ _ 0 Pred _ there-was
5 k k ADP _ _ 4 AuxP _ to
6 nehodě nehoda NOUN _ _ 5 Obj _ accident
7 . . PUNCT _ _ 0 AuxK _ .
(See also http://spyysalo.github.io/conllu.js/.)
You can have any number of visualizations on a page, and any standard HTML content can be freely mixed with the visualizations.
Alternative form
As an alternative to the ~~~
syntax, you can use the equivalent HTML
tag form:
<div class="sd-parse">
Dogs run
nsubj(run, Dogs)
</div>
This form is more flexible in allowing e.g. additional attributes to control aspects of the visualization. For example,
<div class="sd-parse" id="simple-example-parse" tabs="yes">
Dogs run
nsubj(run, Dogs)
</div>
gives
Ambiguous tokens
If your example has several instances of the same token, you can use their position to refer to the exact token. In the following example can-5
refers to the fifth token of the sentence, can
.
~~~ sdparse
I can can the can .
nsubj(can-3, I)
aux(can-3, can-2)
det(can-5,the)
obj(can-3,can-5)
punct(can-3,.)
~~~
will result in this visualization
I can can the can .
nsubj(can-3, I)
aux(can-3, can-2)
det(can-5,the)
obj(can-3,can-5)
punct(can-3,.)
POS tags
POS tags are optional and use the format “text/POS”.
~~~ sdparse
POS/NNP tags/NNS can/MD be/VB attached/VBN to/TO ( any part of ) the/DT sentence/NN text/NN ./.
dep(tags-2, POS-1)
nsubjpass(attached-5, tags-2)
aux(attached-5, can-3)
auxpass(attached-5, be-4)
prep(attached-5, to-6)
det(text-14, the-12)
nn(text-14, sentence-13)
pobj(to-6, text-14)
det(part, any)
prep(part, of)
~~~
POS/NNP tags/NNS can/MD be/VB attached/VBN to/TO ( any part of ) the/DT sentence/NN text/NN ./.
dep(tags-2, POS-1)
nsubjpass(attached-5, tags-2)
aux(attached-5, can-3)
auxpass(attached-5, be-4)
prep(attached-5, to-6)
det(text-14, the-12)
nn(text-14, sentence-13)
pobj(to-6, text-14)
det(part, any)
prep(part, of)
Any literal slashes (“/”) can be escaped using backslash.
~~~ sdparse
\\/\\ escapes/VBZ :/: \\o\//\\o\/
nsubj(escapes, \)
~~~
\\/\\ escapes/VBZ :/: \\o\//\\o\/
nsubj(escapes, \)
Features
Features can be specified using the syntax “text/POS[Name=Value]”. Multiple features for a single word are separated by a bar (“|”).
~~~ sdparse
Token/TAG[Feat1=Val1|Feat2=Val2]
~~~
Token/TAG[Feat1=Val1|Feat2=Val2]
Multiple lines of text
The literal sequence \n
in the SD input text is interpreted as a
newline. (This sequence should be separated by space from the rest of
the input.)
~~~ sdparse
One line \n and another.
~~~
gives:
One line \n and another.
Editing
Controls for visualization editing and information is accessible in
elements with the attribute tabs="yes"
(or any other non-empty
value):
<div class="sd-parse" id="simple-example-parse" tabs="yes">
Dogs run
nsubj(run, Dogs)
</div>
This gives:
You can click on the tab on the top right to edit the visualization, but note that the edits are not saved anywhere as there’s no server. This is mostly useful to build and debug examples.
Unicode
Everything is unicode-compliant.
~~~ sdparse
ロボットは 東大に 入れる か 。
nsubj(入れる, ロボットは)
nommod(入れる, 東大に)
~~~
ロボットは 東大に 入れる か 。
nsubj(入れる, ロボットは)
nommod(入れる, 東大に)
Links to documentation
The system supports a simplified syntax for linking documentation pages that are part of a collection (e.g. universal dependency types, POS tags, etc.).
The basic syntax is [COLL/DOC]()
, where COLL
is the collection
name and DOC
the document title. For example, [u-dep/aux]()
is
linked as follows: u-dep/aux.
The shorter form [DOC]()
(omitting the collection) can be used when
referring to another document in the current collection (e.g. linking
between different documents in the u-dep
collection) or when the
document title is unique. For example, [nmod:own]()
can be assured
to link to nmod:own as the type is unique to the Finnish
annotation.