Towards Sense-level Bilingual Dictionary Induction

Lydia Körber, Katja Markert, Wei Zhao


Abstract
Updating bilingual dictionary entries is a tedious, time-consuming, and highly subjective task, especially when a new sense in the source language requires identifying an appropriate translation equivalent. To date, there have been no attempts to automate the discovery of new bilingual sense entries. Related tasks such as Word-level Bilingual Dictionary Induction and cross-lingual embedding alignment do not account for polysemy and are not applied to lexicographic data. In contrast to their monolingual counterparts, bilingual dictionaries fall short in terms of completeness, detail with respect to examples and glosses, and diachronic information. We introduce a novel NLP task, Sense-Level Bilingual Dictionary Induction (SenseBDI), at the intersection of lexical semantics, cross-lingual NLP, and diachronic NLP. We construct a dataset of time-stamped sense-level bilingual dictionary entries by aligning two bilingual dictionaries, two monolingual dictionaries, and the multilingual resource BabelNet, thereby enriching bilingual entries with monolingual source-language information. We propose a baseline based on nearest-neighbor search over cross-lingual embeddings of glosses and usages. We find that usages contribute more strongly than glosses, with substantial variation across language pairs, and we discuss task-specific challenges with regard to target-language polysemy as well as future directions such as transfer to real-world scenarios.
Anthology ID:
2026.starsem-conference.18
Volume:
Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Saif M. Mohammad, Nedjma Ousidhoum
Venues:
*SEM | WS
Publisher:
Association for Computational Linguistics
Pages:
275–289
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.starsem-conference.18/
Cite (ACL):
Lydia Körber, Katja Markert, and Wei Zhao. 2026. Towards Sense-level Bilingual Dictionary Induction. In Proceedings of the 15th Joint Conference on Lexical and Computational Semantics (*SEM 2026), pages 275–289, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Towards Sense-level Bilingual Dictionary Induction (Körber et al., *SEM 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.starsem-conference.18.pdf