Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport

Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, Philipp Koehn


Abstract
Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval. In this work, we improve bilingual lexicon induction performance across 40 language pairs with a graph-matching method based on optimal transport. The method is especially effective when little supervision is available.
Anthology ID:
2022.emnlp-main.164
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2545–2561
URL:
https://aclanthology.org/2022.emnlp-main.164
Cite (ACL):
Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, and Philipp Koehn. 2022. Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2545–2561, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport (Marchisio et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-ingestion/2022.emnlp-main.164.pdf