MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction

Pengcheng Yang, Fuli Luo, Peng Chen, Tianyu Liu, Xu Sun


Abstract
The task of unsupervised bilingual lexicon induction (UBLI) aims to induce word translations from monolingual corpora in two languages. Previous work has shown that morphological variation is an intractable challenge for the UBLI task: in failure cases, the induced translation is usually morphologically related to the correct translation. To tackle this challenge, we propose a morphology-aware alignment model for the UBLI task. The proposed model aims to alleviate the adverse effect of morphological variation by introducing grammatical information learned by a pre-trained denoising language model. Results show that our approach substantially outperforms several state-of-the-art unsupervised systems and even achieves competitive performance compared to supervised methods.
Anthology ID:
P19-1308
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3190–3196
URL:
https://aclanthology.org/P19-1308
DOI:
10.18653/v1/P19-1308
Cite (ACL):
Pengcheng Yang, Fuli Luo, Peng Chen, Tianyu Liu, and Xu Sun. 2019. MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3190–3196, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
MAAM: A Morphology-Aware Alignment Model for Unsupervised Bilingual Lexicon Induction (Yang et al., ACL 2019)
PDF:
https://preview.aclanthology.org/author-url/P19-1308.pdf