Abstract
The majority of multiword expressions (MWEs) can be interpreted either figuratively or literally depending on the context, which poses challenges in a number of downstream tasks. Most previous work deals with this ambiguity following the observation that MWEs with different usages occur in distinctly different contexts. Following this insight, we explore the usefulness of contextual embeddings by means of both supervised and unsupervised classification. The results show that in the supervised setting, the state of the art can be substantially improved for all expressions in the experiments. The unsupervised classification similarly yields very impressive results, comparing favorably to the supervised classifier for the majority of the expressions. We also show that multilingual contextual embeddings can be employed for this task without any significant loss in performance; hence, the proposed methodology has the potential to be extended to a number of languages.
- Anthology ID:
- 2020.mwe-1.11
- Volume:
- Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
- Month:
- December
- Year:
- 2020
- Address:
- online
- Editors:
- Stella Markantonatou, John McCrae, Jelena Mitrović, Carole Tiberius, Carlos Ramisch, Ashwini Vaidya, Petya Osenova, Agata Savary
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 85–94
- URL:
- https://aclanthology.org/2020.mwe-1.11
- Cite (ACL):
- Murathan Kurfalı and Robert Östling. 2020. Disambiguation of Potentially Idiomatic Expressions with Contextual Embeddings. In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pages 85–94, online. Association for Computational Linguistics.
- Cite (Informal):
- Disambiguation of Potentially Idiomatic Expressions with Contextual Embeddings (Kurfalı & Östling, MWE 2020)
- PDF:
- https://preview.aclanthology.org/landing_page/2020.mwe-1.11.pdf