Learning a mapping between the word embeddings of two languages from a given dictionary is an important problem with several applications. A common approach is to model the mapping as an orthogonal matrix, and the Orthogonal Procrustes Analysis (PA) algorithm can be applied to find the optimal one. However, a single orthogonal matrix restricts the expressiveness of the translation model, which may result in sub-optimal translations. We propose a natural extension of the PA algorithm that models the mapping with multiple orthogonal translation matrices, and we derive an algorithm to learn these matrices. Compared to the single-matrix baseline, we achieve better performance on a bilingual word translation task and a cross-lingual word similarity task. We also show how multiple matrices can model multiple senses of a word.
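For reference, the single-matrix PA baseline mentioned above has a well-known closed-form solution via the singular value decomposition. The following is a minimal sketch of that baseline only, not of the proposed multi-matrix extension; the function name `orthogonal_procrustes`, the row-wise layout of the dictionary pairs, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Return the orthogonal matrix Q that minimizes ||X @ Q - Y||_F.

    X, Y: (n, d) arrays; row i of X and row i of Y hold the embeddings
    of the i-th dictionary pair (source word, target word).
    """
    # The minimizer is Q = U @ Vt, where U, S, Vt is the SVD of X^T Y.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic check (illustrative data, not from the paper): build target
# embeddings by rotating the source embeddings with a known orthogonal map.
rng = np.random.default_rng(0)
d, n = 5, 50
Q_true, _ = np.linalg.qr(rng.standard_normal((d, d)))
X = rng.standard_normal((n, d))
Y = X @ Q_true

Q = orthogonal_procrustes(X, Y)
```

In this noise-free setting the learned `Q` coincides with `Q_true`; with real embeddings the fit is only approximate, which is the limitation the multi-matrix extension is meant to address.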
Current multilingual word translation methods focus on jointly learning mappings from each language to a shared space. The actual translation, however, is still performed as an isolated bilingual task. In this study we propose a multilingual translation procedure that uses all the learned mappings to translate a word from one language to another. For each source word, we first search for the most relevant auxiliary languages. We then use the translations into these languages to form an improved representation of the source word. Finally, this representation is used for the actual translation into the target language. Experiments on a standard multilingual word translation benchmark demonstrate that our model outperforms previous state-of-the-art results.
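The three steps of this procedure (select auxiliary languages, refine the source representation, translate) can be illustrated with a toy sketch. The abstract does not specify how relevance is scored or how the translations are combined, so everything concrete here is an assumption: relevance is taken to be the cosine similarity of the nearest neighbor in each auxiliary language, the refined representation is a plain average, and the names `translate`, `aux_spaces`, and `k` are hypothetical.

```python
import numpy as np

def unit(v):
    """Scale vectors to unit length along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def translate(src_vec, aux_spaces, tgt_space, k=2):
    """Translate one source word using auxiliary languages.

    src_vec:    (d,) source word vector, already mapped to the shared space.
    aux_spaces: dict mapping language name -> (n_i, d) array of unit-norm
                word vectors in the shared space.
    tgt_space:  (m, d) array of unit-norm target-language vectors.
    Returns the index of the chosen target word.
    """
    src_vec = unit(src_vec)
    candidates = []
    for lang, E in aux_spaces.items():
        sims = E @ src_vec                  # cosine similarity to every word
        best = int(np.argmax(sims))         # translation into this language
        candidates.append((sims[best], E[best]))
    # Assumption: the most relevant languages are those whose best
    # translation is most similar to the source word.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = [vec for _, vec in candidates[:k]]
    # Assumption: the improved representation is a simple average.
    refined = unit(np.mean([src_vec] + top, axis=0))
    return int(np.argmax(tgt_space @ refined))

# Toy shared space with three dimensions (illustrative data only).
aux = {
    "a": np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "b": np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),
}
tgt = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
idx = translate(np.array([2.0, 0.0, 0.0]), aux, tgt, k=1)
```

With `k=1` only language "a" is used, since its nearest neighbor matches the source direction exactly, and the refined vector selects the second target word.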
In this paper we present a novel approach to simultaneously representing multiple languages in a common space. Procrustes Analysis (PA) is commonly used to find the optimal orthogonal word mapping in the bilingual case. The proposed Multi Pairwise Procrustes Analysis (MPPA) is a natural extension of the PA algorithm to multilingual word mapping. Unlike previous PA extensions that require a k-way dictionary, this approach requires only pairwise bilingual dictionaries, which are much easier to construct.
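To make the idea of learning from pairwise dictionaries concrete, here is one possible sketch. The abstract does not state the objective or the optimization scheme, so both are assumptions: the code minimizes the sum of squared distances between mapped dictionary pairs, and it does so by block coordinate descent, solving an ordinary Procrustes problem for one language at a time. The names `fit_pairwise`, `pairs`, and `history` are hypothetical, and this should not be read as the authors' exact MPPA algorithm.

```python
import numpy as np

def procrustes(A, B):
    """Orthogonal Q minimizing ||A @ Q - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def objective(pairs, Q):
    """Sum over dictionaries of ||A @ Q[i] - B @ Q[j]||_F^2."""
    return sum(np.linalg.norm(A @ Q[i] - B @ Q[j]) ** 2
               for (i, j), (A, B) in pairs.items())

def fit_pairwise(pairs, n_langs, d, n_iters=20):
    """pairs: dict (i, j) -> (A, B), where the rows of A (language i) and
    B (language j) are embeddings of one bilingual dictionary's entries."""
    Q = [np.eye(d) for _ in range(n_langs)]
    history = [objective(pairs, Q)]
    for _ in range(n_iters):
        for l in range(n_langs):
            src, tgt = [], []
            # Collect every dictionary that involves language l, with the
            # other language's mapping held fixed.
            for (i, j), (A, B) in pairs.items():
                if i == l:
                    src.append(A)
                    tgt.append(B @ Q[j])
                elif j == l:
                    src.append(B)
                    tgt.append(A @ Q[i])
            if src:
                Q[l] = procrustes(np.vstack(src), np.vstack(tgt))
        history.append(objective(pairs, Q))
    return Q, history

# Synthetic example: three languages are rotated copies of one shared
# space, with dictionaries only for the pairs (0, 1) and (1, 2).
rng = np.random.default_rng(0)
d, n = 4, 30
Z = rng.standard_normal((n, d))
R = [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(3)]
E = [Z @ R_l.T for R_l in R]
pairs = {(0, 1): (E[0][:20], E[1][:20]), (1, 2): (E[1][10:], E[2][10:])}
Q, history = fit_pairwise(pairs, n_langs=3, d=d)
```

Because each update exactly minimizes the assumed objective over one matrix while the others are fixed, the values in `history` never increase. Note that no dictionary between languages 0 and 2 is needed, which mirrors the pairwise-only requirement described above.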