A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping

Kamil Pluciński, Mateusz Lango, Michał Zimniewicz


Abstract
In this work, we study the unsupervised cross-lingual word embeddings mapping method presented by Artetxe et al. (2018). First, we successfully reproduced the experiments performed in the original work, finding only minor differences. Furthermore, we verified the method’s robustness on different embedding representations and new language pairs, particularly those involving Slavic languages like Polish or Czech. We also performed an experimental analysis of the impact of the method’s parameters on the final result. Finally, we looked for an alternative way of initialization, which directly relies on the isometric assumption. Our work confirms the results presented earlier, at the same time pointing at interesting problems occurring while using the method with different types of embeddings or on less-common language pairs.
Anthology ID:
2020.lrec-1.682
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5555–5562
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.682
Cite (ACL):
Kamil Pluciński, Mateusz Lango, and Michał Zimniewicz. 2020. A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5555–5562, Marseille, France. European Language Resources Association.
Cite (Informal):
A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping (Pluciński et al., LREC 2020)
PDF:
https://preview.aclanthology.org/naacl24-info/2020.lrec-1.682.pdf