Abstract
The technological bridges between knowledge graphs and natural language processing are of utmost importance for the future development of language technology. CoNLL-RDF is a technology that provides such a bridge for popular one-word-per-line formats as widely used in NLP (e.g., the CoNLL Shared Tasks), annotation (Universal Dependencies, Unimorph), corpus linguistics (Corpus WorkBench, CWB) and digital lexicography (SketchEngine): Every empty-line separated table (usually a sentence) is parsed into an graph, can be freely manipulated and enriched using W3C-standardized RDF technology, and then be serialized back into in a TSV format, RDF or other formats. An important limitation is that CoNLL-RDF provides native support for word-level annotations only. This does include dependency syntax and semantic role annotations, but neither phrase structures nor text structure. We describe the extension of the CoNLL-RDF technology stack for two vocabulary extensions of CoNLL-TSV, the PTB bracket notation used in earlier CoNLL Shared Tasks and the extension with XML markup elements featured by CWB and SketchEngine. In order to represent the necessary extensions of the CoNLL vocabulary in an adequate fashion, we employ the POWLA vocabulary for representing and navigating in tree structures.- Anthology ID:
- 2020.lrec-1.885
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association
- Note:
- Pages:
- 7161–7169
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.885
- DOI:
- Cite (ACL):
- Christian Chiarcos and Luis Glaser. 2020. A Tree Extension for CoNLL-RDF. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 7161–7169, Marseille, France. European Language Resources Association.
- Cite (Informal):
- A Tree Extension for CoNLL-RDF (Chiarcos & Glaser, LREC 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.885.pdf