Abstract
Inducing multilingual word embeddings by learning a linear map between embedding spaces of different languages achieves remarkable accuracy on related languages. However, accuracy drops substantially when translating between distant languages. Given that languages exhibit differences in vocabulary, grammar, written form, or syntax, one would expect that embedding spaces of different languages have different structures especially for distant languages. With the goal of capturing such differences, we propose a method for learning neighborhood sensitive maps, NORMA. Our experiments show that NORMA outperforms current state-of-the-art methods for word translation between distant languages.
- Anthology ID:
- D18-1047
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 512–522
- URL:
- https://aclanthology.org/D18-1047
- DOI:
- 10.18653/v1/D18-1047
- Cite (ACL):
- Ndapa Nakashole. 2018. NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 512–522, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings (Nakashole, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/D18-1047.pdf