Abstract
Distributed word embeddings have shown superior performance in numerous Natural Language Processing (NLP) tasks. However, their performance varies significantly across tasks, implying that the embeddings learnt by different methods capture complementary aspects of lexical semantics. We therefore argue that it is important to combine existing word embeddings to produce more accurate and complete meta-embeddings of words. We model meta-embedding learning as an autoencoding problem, in which we learn a meta-embedding space that can accurately reconstruct all source embeddings simultaneously. The meta-embedding space is thereby forced to capture the complementary information in the different source embeddings within a coherent common embedding space. We propose three flavours of autoencoded meta-embeddings, motivated by different requirements that a meta-embedding must satisfy. Experimental results on a series of benchmark evaluations show that the proposed autoencoded meta-embeddings outperform existing state-of-the-art meta-embeddings on multiple tasks.
- Anthology ID: C18-1140
- Volume: Proceedings of the 27th International Conference on Computational Linguistics
- Month: August
- Year: 2018
- Address: Santa Fe, New Mexico, USA
- Editors: Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue: COLING
- Publisher: Association for Computational Linguistics
- Pages: 1650–1661
- URL: https://aclanthology.org/C18-1140
- Cite (ACL): Danushka Bollegala and Cong Bao. 2018. Learning Word Meta-Embeddings by Autoencoding. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1650–1661, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal): Learning Word Meta-Embeddings by Autoencoding (Bollegala & Bao, COLING 2018)
- PDF: https://preview.aclanthology.org/fix-dup-bibkey/C18-1140.pdf
- Code: CongBao/AutoencodedMetaEmbedding
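As a rough illustration of the idea described in the abstract, the following is a minimal sketch of learning a meta-embedding space by jointly reconstructing two source embedding sets with a linear autoencoder. This is not the authors' implementation: the toy data, dimensions, linear encoder/decoders, and hand-written gradient-descent loop are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not the paper's code): two toy "source embeddings"
# for the same 100-word vocabulary, encoded into one meta-embedding space
# that must reconstruct both sources simultaneously.
rng = np.random.default_rng(0)
n_words, d1, d2, d_meta = 100, 20, 30, 16

S1 = rng.normal(size=(n_words, d1))           # source embedding set 1
S2 = rng.normal(size=(n_words, d2))           # source embedding set 2
X = np.concatenate([S1, S2], axis=1)          # concatenated input, (n_words, d1+d2)

# Linear encoder E and one decoder per source (D1, D2).
E = rng.normal(scale=0.1, size=(d1 + d2, d_meta))
D1 = rng.normal(scale=0.1, size=(d_meta, d1))
D2 = rng.normal(scale=0.1, size=(d_meta, d2))

lr, losses = 0.01, []
for step in range(2000):
    M = X @ E                                 # meta-embeddings, (n_words, d_meta)
    err1 = M @ D1 - S1                        # reconstruction error for source 1
    err2 = M @ D2 - S2                        # reconstruction error for source 2
    losses.append((err1**2).mean() + (err2**2).mean())

    # Gradients of the summed mean-squared reconstruction loss.
    gM = (2 / (n_words * d1)) * err1 @ D1.T + (2 / (n_words * d2)) * err2 @ D2.T
    E -= lr * (X.T @ gM)
    D1 -= lr * (2 / (n_words * d1)) * M.T @ err1
    D2 -= lr * (2 / (n_words * d2)) * M.T @ err2

meta = X @ E                                  # final meta-embeddings
```

Minimising the sum of per-source reconstruction losses is what forces the shared space to retain complementary information from both sources; the paper's three AEME variants differ in how the sources are combined, while this sketch shows only the simplest concatenated, linear case.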