Abstract
Translating into low-resource languages is challenging due to the scarcity of training data. In this paper, we propose a probabilistic lexical translation method that bridges through lexical relations including synonyms, hypernyms, hyponyms, and co-hyponyms. This method, which only requires a dictionary like Wiktionary and a lexical database like WordNet, enables the translation of unknown vocabulary into low-resource languages for which we may only know the translation of a related concept. Experiments on translating a core vocabulary set into 472 languages, most of them low-resource, show the effectiveness of our approach.
- Anthology ID:
- 2022.loresmt-1.3
- Volume:
- Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022)
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Venue:
- LoResMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 15–22
- URL:
- https://aclanthology.org/2022.loresmt-1.3
- Cite (ACL):
- Winston Wu and David Yarowsky. 2022. Known Words Will Do: Unknown Concept Translation via Lexical Relations. In Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022), pages 15–22, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- Known Words Will Do: Unknown Concept Translation via Lexical Relations (Wu & Yarowsky, LoResMT 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.loresmt-1.3.pdf
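
The bridging idea in the abstract (translating an unknown concept through a related concept whose translation is known) can be illustrated with a minimal sketch. This is not the authors' model: the relation weights, the toy relation table, the placeholder target-language strings, and the function name `translate_via_relations` are all assumptions made for illustration. A real setup would presumably draw relations from WordNet and translations from Wiktionary.

```python
from collections import defaultdict

# Hypothetical toy lexical relations (stand-in for a resource like WordNet).
# Maps a source word to (related word, relation type) pairs.
RELATIONS = {
    "puppy": [("dog", "hypernym"), ("kitten", "co-hyponym")],
}

# Assumed relation weights, chosen only for illustration (not from the paper).
RELATION_WEIGHT = {
    "synonym": 1.0,
    "hypernym": 0.6,
    "hyponym": 0.5,
    "co-hyponym": 0.3,
}

# Hypothetical bilingual dictionary (stand-in for a resource like Wiktionary).
# The target-side strings are placeholders, not real translations.
DICTIONARY = {
    "dog": ["TGT_dog"],
    "kitten": ["TGT_kitten"],
}


def translate_via_relations(word):
    """Return {candidate translation: normalized score} for a source word.

    If the word has a direct dictionary entry, its translations share the
    probability mass uniformly. Otherwise, candidates are collected from
    related words and scored by the assumed relation weights.
    """
    if word in DICTIONARY:
        direct = DICTIONARY[word]
        return {t: 1.0 / len(direct) for t in direct}

    scores = defaultdict(float)
    for related, relation in RELATIONS.get(word, []):
        translations = DICTIONARY.get(related, [])
        for t in translations:
            # Split the relation's weight across the related word's translations.
            scores[t] += RELATION_WEIGHT[relation] / len(translations)

    total = sum(scores.values())
    if total == 0:
        return {}  # no related concept with a known translation
    return {t: s / total for t, s in scores.items()}


candidates = translate_via_relations("puppy")
best = max(candidates, key=candidates.get)
```

In this toy example "puppy" has no dictionary entry, so its candidates come from "dog" (hypernym) and "kitten" (co-hyponym). The hypernym's placeholder translation ranks first only because of the higher weight assumed for that relation.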