Abstract
In this paper, we present MultiVitaminBooster, a system implemented for edition 1.2 of the PARSEME shared task on semi-supervised identification of verbal multiword expressions. We interpret the detection of verbal multiword expressions as a token classification task: deciding, for each token, whether or not it is part of a verbal multiword expression. For this purpose, we train gradient-boosting models on tokens encoded as feature vectors that combine multilingual contextualized word embeddings provided by the XLM-RoBERTa language model with a more traditional linguistic feature set based on context windows and dependency relations. Our system was ranked 7th in the official open-track ranking of the shared task, although an encoding-related bug distorted its results. For this reason, we carry out further unofficial evaluations, in which versions of our system would have achieved higher ranks.
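The sketch below is not the authors' released code; it only illustrates how such a pipeline could be assembled: per-token XLM-RoBERTa embeddings from the Hugging Face transformers library are concatenated with simple window- and dependency-based features and fed to a gradient-boosting classifier. The LightGBM classifier, the CoNLL-U-like input format, the hashed categorical encoding, and all hyper-parameters are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch (not the authors' code): binary token classification for VMWE
# detection with gradient boosting over feature vectors that concatenate
# XLM-RoBERTa embeddings with simple window- and dependency-based features.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from lightgbm import LGBMClassifier  # stand-in gradient-boosting implementation

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

def embed_tokens(words):
    """Return one XLM-RoBERTa vector per input word (first sub-word piece)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]  # (pieces, 768)
    vectors = []
    for i in range(len(words)):
        piece = enc.word_ids(0).index(i)  # first sub-word piece of word i
        vectors.append(hidden[piece].numpy())
    return np.stack(vectors)

def featurize(sentence):
    """sentence: list of dicts with 'form', 'upos', 'deprel' and a 0-based
    'head' index (-1 for the root); this input format is an assumption."""
    words = [tok["form"] for tok in sentence]
    emb = embed_tokens(words)
    rows = []
    for i, tok in enumerate(sentence):
        # context window of POS tags around the current token
        window_pos = [sentence[j]["upos"] if 0 <= j < len(sentence) else "PAD"
                      for j in (i - 1, i, i + 1)]
        # dependency relation of the token and POS of its syntactic head
        dep = [tok["deprel"],
               sentence[tok["head"]]["upos"] if tok["head"] >= 0 else "ROOT"]
        # categorical features would be one-hot or hashed properly in practice;
        # Python's hash() is used here only for brevity
        cat = np.array([hash(v) % 1000 for v in window_pos + dep], dtype=float)
        rows.append(np.concatenate([emb[i], cat]))
    return np.stack(rows)

clf = LGBMClassifier(n_estimators=500, learning_rate=0.05)  # placeholder hyper-parameters
# clf.fit(X_train, y_train)                 # y_train: 1 if the token is part of a VMWE, else 0
# predictions = clf.predict(featurize(test_sentence))
```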
- Anthology ID:
- 2020.mwe-1.20
- Volume:
- Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
- Month:
- December
- Year:
- 2020
- Address:
- online
- Editors:
- Stella Markantonatou, John McCrae, Jelena Mitrović, Carole Tiberius, Carlos Ramisch, Ashwini Vaidya, Petya Osenova, Agata Savary
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 149–155
- URL:
- https://aclanthology.org/2020.mwe-1.20
- Cite (ACL):
- Sebastian Gombert and Sabine Bartsch. 2020. MultiVitaminBooster at PARSEME Shared Task 2020: Combining Window- and Dependency-Based Features with Multilingual Contextualised Word Embeddings for VMWE Detection. In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pages 149–155, online. Association for Computational Linguistics.
- Cite (Informal):
- MultiVitaminBooster at PARSEME Shared Task 2020: Combining Window- and Dependency-Based Features with Multilingual Contextualised Word Embeddings for VMWE Detection (Gombert & Bartsch, MWE 2020)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2020.mwe-1.20.pdf