Advantages of a Complex Multilayer Annotation Scheme: The Case of the Prague Dependency Treebank

Eva Hajicova, Marie Mikulová, Barbora Štěpánková, Jiří Mírovský


Abstract
Recently, many corpora have been developed that contain multiple annotations of various linguistic phenomena, from morphological categories of words through the syntactic structure of sentences to discourse and coreference relations in texts. Discussions are ongoing about which annotation scheme is appropriate for such a large amount of diverse information. In this contribution we argue that a multilayer annotation scheme makes it possible to view the language system in its complexity and in the interaction of individual phenomena, and that at least two aspects support such a scheme: (i) A multilayer annotation scheme makes it possible to use the annotation of one layer to design the annotation of another layer or layers, both conceptually and in the form of a pre-annotation procedure or annotation-checking rules. (ii) A multilayer annotation scheme provides a reliable basis for corpus studies based on features across the layers. These aspects are demonstrated on the case of the Prague Dependency Treebank. Its multilayer annotation scheme has withstood the test of time and also serves well for complex textual annotations, which make advantageous use of the earlier morpho-syntactic annotations. In addition to referring to previous projects that utilise its annotation scheme, we present several current investigations.
Anthology ID:
2022.law-1.8
Volume:
Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LAW
SIG:
SIGANN
Publisher:
European Language Resources Association
Pages:
70–78
URL:
https://aclanthology.org/2022.law-1.8
Cite (ACL):
Eva Hajicova, Marie Mikulová, Barbora Štěpánková, and Jiří Mírovský. 2022. Advantages of a Complex Multilayer Annotation Scheme: The Case of the Prague Dependency Treebank. In Proceedings of the 16th Linguistic Annotation Workshop (LAW-XVI) within LREC2022, pages 70–78, Marseille, France. European Language Resources Association.
Cite (Informal):
Advantages of a Complex Multilayer Annotation Scheme: The Case of the Prague Dependency Treebank (Hajicova et al., LAW 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.law-1.8.pdf