Abstract
Contextualized word embeddings, such as ELMo, provide meaningful representations for words and their contexts. They have been shown to have a great impact on downstream applications. However, we observe that the contextualized embeddings of a word might change drastically when its contexts are paraphrased. As these embeddings are over-sensitive to the context, the downstream model may make different predictions when the input sentence is paraphrased. To address this issue, we propose a post-processing approach to retrofit the embedding with paraphrases. Our method learns an orthogonal transformation on the input space of the contextualized word embedding model, which seeks to minimize the variance of word representations on paraphrased contexts. Experiments show that the proposed method significantly improves ELMo on various sentence classification and inference tasks.

- Anthology ID: D19-1113
- Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month: November
- Year: 2019
- Address: Hong Kong, China
- Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues: EMNLP | IJCNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 1198–1203
- URL: https://aclanthology.org/D19-1113
- DOI: 10.18653/v1/D19-1113
- Cite (ACL): Weijia Shi, Muhao Chen, Pei Zhou, and Kai-Wei Chang. 2019. Retrofitting Contextualized Word Embeddings with Paraphrases. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1198–1203, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal): Retrofitting Contextualized Word Embeddings with Paraphrases (Shi et al., EMNLP-IJCNLP 2019)
- PDF: https://preview.aclanthology.org/improve-issue-templates/D19-1113.pdf
- Data: MPQA Opinion Corpus, SQuAD, SST, SST-2
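The abstract's objective — learning an orthogonal transformation of the input space that minimizes the variance of representations across paraphrased contexts — can be sketched as projected gradient descent on the orthogonal group. The sketch below is a toy illustration, not the paper's method: the encoder, the synthetic "paraphrase pairs", and all variable names are hypothetical stand-ins for ELMo and real paraphrase data.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy stand-in for a frozen nonlinear contextual encoder
# (hypothetical; the paper retrofits ELMo's input space).
H = rng.standard_normal((dim, dim)) / np.sqrt(dim)

def encode(v):
    return np.tanh(v @ H)

# Synthetic "paraphrase pairs": two inputs standing for the same word
# in two paraphrased contexts (perturbations of a shared base vector).
pairs = []
for _ in range(32):
    base = rng.standard_normal(dim)
    pairs.append((base + 0.3 * rng.standard_normal(dim),
                  base + 0.3 * rng.standard_normal(dim)))

def loss(W):
    # Variance of representations across each paraphrase pair.
    return sum(np.sum((encode(x1 @ W) - encode(x2 @ W)) ** 2)
               for x1, x2 in pairs)

# Gradient step on W, then project back onto the orthogonal group
# via the polar decomposition (nearest orthogonal matrix by SVD).
W = np.eye(dim)
lr = 1e-3
for _ in range(200):
    grad = np.zeros_like(W)
    for x1, x2 in pairs:
        a1, a2 = np.tanh(x1 @ W @ H), np.tanh(x2 @ W @ H)
        d = a1 - a2
        grad += np.outer(x1, (2 * d * (1 - a1 ** 2)) @ H.T)
        grad -= np.outer(x2, (2 * d * (1 - a2 ** 2)) @ H.T)
    U, _, Vt = np.linalg.svd(W - lr * grad)
    W = U @ Vt  # exactly orthogonal after projection
```

Because W is kept orthogonal, it rotates the input space without distorting its geometry, which is what lets the retrofitted model reduce paraphrase sensitivity while preserving the information in the original embeddings.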