Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport
Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, Philipp Koehn
Abstract
Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semisupervised machine translation and crosslingual information retrieval. In this work, we improve bilingual lexicon induction performance across 40 language pairs with a graph-matching method based on optimal transport. The method is especially strong with low amounts of supervision.
- Anthology ID:
- 2022.emnlp-main.164
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2545–2561
- URL:
- https://aclanthology.org/2022.emnlp-main.164
- Cite (ACL):
- Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, and Philipp Koehn. 2022. Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2545–2561, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport (Marchisio et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.emnlp-main.164.pdf