How Well Do Embedding Models Capture Non-compositionality? A View from Multiword Expressions

Navnita Nandakumar, Timothy Baldwin, Bahar Salehi

Abstract
In this paper, we apply various embedding methods to multiword expressions to study how well they capture the nuances of non-compositional data. Our results from a pool of word-, character-, and document-level embeddings suggest that Word2vec performs the best, followed by FastText and InferSent. Moreover, we find that recently-proposed contextualised embedding models such as BERT and ELMo are not adept at handling non-compositionality in multiword expressions.
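As a concrete illustration, the sketch below shows the standard compositionality measure this line of work builds on: compare the embedding of the whole expression against a composition (here, the sum) of its constituent embeddings, with low similarity indicating non-compositionality. This is a minimal sketch, not the authors' released code; the gensim model name and the example MWE token are illustrative assumptions.

import numpy as np
import gensim.downloader as api

# Pre-trained word2vec vectors (illustrative choice of model).
model = api.load("word2vec-google-news-300")

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def compositionality(mwe_token, parts):
    """Cosine between the MWE's own vector (a single phrase token,
    e.g. "couch_potato") and the sum of its component word vectors.
    A low score suggests the expression is non-compositional."""
    whole = model[mwe_token]
    composed = np.sum([model[w] for w in parts], axis=0)
    return cosine(whole, composed)

# Assumes this phrase token exists in the model's vocabulary.
if "couch_potato" in model:
    print(compositionality("couch_potato", ["couch", "potato"]))

Summation is only one choice of composition function; averaging or element-wise multiplication of the constituent vectors are common alternatives in the MWE compositionality literature.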
Anthology ID:
W19-2004
Volume:
Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP
Month:
June
Year:
2019
Address:
Minneapolis, USA
Editors:
Anna Rogers, Aleksandr Drozd, Anna Rumshisky, Yoav Goldberg
Venue:
RepEval
Publisher:
Association for Computational Linguistics
Pages:
27–34
URL:
https://aclanthology.org/W19-2004
DOI:
10.18653/v1/W19-2004
Cite (ACL):
Navnita Nandakumar, Timothy Baldwin, and Bahar Salehi. 2019. How Well Do Embedding Models Capture Non-compositionality? A View from Multiword Expressions. In Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP, pages 27–34, Minneapolis, USA. Association for Computational Linguistics.
Cite (Informal):
How Well Do Embedding Models Capture Non-compositionality? A View from Multiword Expressions (Nandakumar et al., RepEval 2019)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/W19-2004.pdf