Multilingual Factor Analysis
Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos Korfiatis, Nils Hammerla
Abstract
In this work we approach the task of learning multilingual word representations in an offline manner by fitting a generative latent variable model to a multilingual dictionary. We model equivalent words in different languages as different views of the same word generated by a common latent variable representing their latent lexical meaning. We explore the task of alignment by querying the fitted model for multilingual embeddings, achieving competitive results across a variety of tasks. The proposed model is robust to noise in the embedding space, making it a suitable method for distributed representations learned from noisy corpora.
- Anthology ID:
- P19-1170
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1738–1750
- URL:
- https://aclanthology.org/P19-1170
- DOI:
- 10.18653/v1/P19-1170
- Cite (ACL):
- Francisco Vargas, Kamen Brestnichki, Alex Papadopoulos Korfiatis, and Nils Hammerla. 2019. Multilingual Factor Analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1738–1750, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Multilingual Factor Analysis (Vargas et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/P19-1170.pdf
- Code:
- Babylonpartners/MultilingualFactorAnalysis
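
The abstract describes equivalent words in different languages as views generated by a common latent variable, but it does not spell out the model's form. As a rough illustration only, the sketch below assumes a linear-Gaussian, two-language version of that idea: a shared latent vector z is mapped through one loading matrix per language, and independent noise is added to each view. The function name, dimensions, and noise level are all hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def sample_two_views(n, W_x, W_y, mu_x, mu_y, noise_std, rng):
    """Draw n paired samples from an assumed shared-latent linear-Gaussian model."""
    k = W_x.shape[1]
    # Shared latent variable, standing in for the "latent lexical meaning".
    z = rng.standard_normal((n, k))
    # Each language sees the same z through its own loading matrix plus noise.
    x = z @ W_x.T + mu_x + noise_std * rng.standard_normal((n, W_x.shape[0]))
    y = z @ W_y.T + mu_y + noise_std * rng.standard_normal((n, W_y.shape[0]))
    return z, x, y

rng = np.random.default_rng(0)
k, d_x, d_y = 3, 5, 4                      # illustrative sizes, not from the paper
W_x = rng.standard_normal((d_x, k))
W_y = rng.standard_normal((d_y, k))
mu_x, mu_y = np.zeros(d_x), np.zeros(d_y)

z, x, y = sample_two_views(100_000, W_x, W_y, mu_x, mu_y, 0.1, rng)

# Under this assumed model the only link between the two views is z,
# so the cross-covariance of x and y should approach W_x @ W_y.T.
xc = x - x.mean(axis=0)
yc = y - y.mean(axis=0)
cross_cov = xc.T @ yc / (len(x) - 1)
max_err = np.abs(cross_cov - W_x @ W_y.T).max()
print("largest deviation from W_x W_y^T:", max_err)
```

In this toy setting, the noise added to each view is independent, so it does not enter the cross-covariance between views; this gives one intuition for why a shared-latent model could tolerate noise in the individual embedding spaces.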