How Lexical is Bilingual Lexicon Induction?

Harsh Kohli, Helian Feng, Nicholas Dronen, Calvin McCarter, Sina Moeini, Ali Kebarighotbi


Abstract
In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair. Recently, the retrieve-and-rank approach to BLI has achieved state-of-the-art results on the task. However, the problem remains challenging in low-resource settings due to the paucity of data, and it is further complicated by factors such as lexical variation across languages. We argue that incorporating additional lexical information into the recent retrieve-and-rank approach should improve lexicon induction. We demonstrate the efficacy of our proposed approach on XLING, improving over the previous state of the art by an average of 2% across all language pairs.
Anthology ID:
2024.findings-naacl.273
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4381–4386
URL:
https://aclanthology.org/2024.findings-naacl.273
DOI:
10.18653/v1/2024.findings-naacl.273
Cite (ACL):
Harsh Kohli, Helian Feng, Nicholas Dronen, Calvin McCarter, Sina Moeini, and Ali Kebarighotbi. 2024. How Lexical is Bilingual Lexicon Induction? In Findings of the Association for Computational Linguistics: NAACL 2024, pages 4381–4386, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
How Lexical is Bilingual Lexicon Induction? (Kohli et al., Findings 2024)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2024.findings-naacl.273.pdf