Known Words Will Do: Unknown Concept Translation via Lexical Relations

Winston Wu, David Yarowsky


Abstract
Translating into low-resource languages is challenging due to the scarcity of training data. In this paper, we propose a probabilistic lexical translation method that bridges through lexical relations including synonyms, hypernyms, hyponyms, and co-hyponyms. This method, which only requires a dictionary like Wiktionary and a lexical database like WordNet, enables the translation of unknown vocabulary into low-resource languages for which we may only know the translation of a related concept. Experiments on translating a core vocabulary set into 472 languages, most of them low-resource, show the effectiveness of our approach.
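The abstract does not spell out the probabilistic model, so the following is only a toy sketch of the general idea of bridging through lexical relations, not the authors' method. Everything in it is an assumption made for illustration: the relation weights, the tiny `RELATIONS` table standing in for WordNet, the tiny `DICTIONARY` standing in for Wiktionary, and the function name `translate_via_relations`.

```python
from collections import defaultdict

# Hypothetical trust placed in a translation borrowed through each relation
# type. These numbers are illustrative assumptions, not values from the paper.
RELATION_WEIGHT = {
    "synonym": 0.9,
    "hypernym": 0.6,
    "hyponym": 0.5,
    "co-hyponym": 0.4,
}

# Toy lexical relations, standing in for a lexical database like WordNet.
RELATIONS = {
    "hound": [("dog", "synonym"), ("animal", "hypernym"), ("wolf", "co-hyponym")],
}

# Toy bilingual dictionary, standing in for a dictionary like Wiktionary.
# Maps a source word to {target word: P(target | source word)}.
DICTIONARY = {
    "dog": {"perro": 0.8, "can": 0.2},
    "animal": {"animal": 1.0},
}


def translate_via_relations(word):
    """Return (target, score) pairs for `word`, best first.

    If the word is in the dictionary, use it directly. Otherwise, collect
    translations of related words, weight them by relation type, and
    normalize the scores so they sum to 1.
    """
    if word in DICTIONARY:
        return sorted(DICTIONARY[word].items(), key=lambda kv: -kv[1])

    scores = defaultdict(float)
    for related, relation in RELATIONS.get(word, []):
        # Related words with no known translation contribute nothing.
        for target, prob in DICTIONARY.get(related, {}).items():
            scores[target] += RELATION_WEIGHT[relation] * prob

    total = sum(scores.values())
    if total == 0:
        return []  # no related concept has a known translation
    return sorted(((t, s / total) for t, s in scores.items()),
                  key=lambda kv: -kv[1])


if __name__ == "__main__":
    for target, score in translate_via_relations("hound"):
        print(f"{target}\t{score:.2f}")
```

In this toy setup, "hound" has no dictionary entry, so its candidates come from the translations of "dog" and "animal"; "wolf" contributes nothing because it also lacks a translation.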
Anthology ID:
2022.loresmt-1.3
Volume:
Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022)
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Atul Kr. Ojha, Chao-Hong Liu, Ekaterina Vylomova, Jade Abbott, Jonathan Washington, Nathaniel Oco, Tommi A Pirinen, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
15–22
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2022.loresmt-1.3/
Cite (ACL):
Winston Wu and David Yarowsky. 2022. Known Words Will Do: Unknown Concept Translation via Lexical Relations. In Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022), pages 15–22, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
Known Words Will Do: Unknown Concept Translation via Lexical Relations (Wu & Yarowsky, LoResMT 2022)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2022.loresmt-1.3.pdf