Abstract
This paper presents a methodology for identifying and resolving various kinds of inconsistency in the context of merging dependency and multiword expression (MWE) annotations, to generate a dependency treebank with comprehensive MWE annotations. Candidates for correction are identified using a variety of heuristics, including an entirely novel one which identifies violations of MWE constituency in the dependency tree, and resolved by arbitration with minimal human intervention. Using this technique, we identified and corrected several hundred errors across both parse and MWE annotations, representing changes to a significant percentage (well over 10%) of the MWE instances in the joint corpus.
- Anthology ID:
- W17-1726
- Volume:
- Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Stella Markantonatou, Carlos Ramisch, Agata Savary, Veronika Vincze
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 187–193
- URL:
- https://aclanthology.org/W17-1726
- DOI:
- 10.18653/v1/W17-1726
- Cite (ACL):
- King Chan, Julian Brooke, and Timothy Baldwin. 2017. Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation. In Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017), pages 187–193, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation (Chan et al., MWE 2017)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/W17-1726.pdf
- Code:
- eltimster/HAMSTER
- Data:
- English Web Treebank, Universal Dependencies
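
The abstract mentions a heuristic that identifies violations of MWE constituency in the dependency tree but does not spell it out. The sketch below is one plausible reading, not the authors' implementation: it assumes an MWE is consistent with a parse when its tokens form a single connected piece of the tree, which holds exactly when one and only one MWE token has its head outside the MWE. The function names, the head-map representation, and the example sentence are all hypothetical.

```python
from typing import Dict, List, Set


def external_head_tokens(heads: Dict[int, int], mwe: Set[int]) -> List[int]:
    """Return, in order, the MWE tokens whose syntactic head is not in the MWE."""
    return sorted(tok for tok in mwe if heads[tok] not in mwe)


def violates_constituency(heads: Dict[int, int], mwe: Set[int]) -> bool:
    """Flag an MWE whose tokens do not form one connected piece of the tree.

    Assumption: in a well-formed tree, a token set is connected exactly when
    one and only one of its tokens has its head outside the set.
    """
    return len(external_head_tokens(heads, mwe)) != 1


# Hypothetical example: "He kicked the bucket", tokens numbered from 1,
# head 0 marks the root.
good_parse = {1: 2, 2: 0, 3: 4, 4: 2}
bad_parse = {1: 2, 2: 0, 3: 1, 4: 2}   # "the" wrongly attached to "He"
mwe = {2, 3, 4}                        # kicked the bucket

for name, heads in [("good", good_parse), ("bad", bad_parse)]:
    print(name, violates_constituency(heads, mwe), external_head_tokens(heads, mwe))
```

A flagged MWE only marks a candidate for review. Whether the parse or the MWE annotation is at fault is left to the arbitration step the abstract describes.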