Abstract
The Universal Dependencies treebanks are a still-growing collection of treebanks for a wide range of languages, all annotated with a common inventory of dependency relations. Yet, the usages of the relations can be categorically different even for treebanks of the same language. We present a pilot study on identifying such inconsistencies in a language-independent way and conduct an experiment which illustrates that a proper handling of inconsistencies can improve parsing performance by several percentage points.
- Anthology ID:
- 2020.udw-1.8
- Volume:
- Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- UDW
- Publisher:
- Association for Computational Linguistics
- Pages:
- 67–75
- URL:
- https://aclanthology.org/2020.udw-1.8
- Cite (ACL):
- Tillmann Dönicke, Xiang Yu, and Jonas Kuhn. 2020. Identifying and Handling Cross-Treebank Inconsistencies in UD: A Pilot Study. In Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020), pages 67–75, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal):
- Identifying and Handling Cross-Treebank Inconsistencies in UD: A Pilot Study (Dönicke et al., UDW 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.udw-1.8.pdf
- Code:
- tidoe/typology-coling