Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation
Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, Eiichiro Sumita
Abstract
In this study, linguistic knowledge at different levels is incorporated into the neural machine translation (NMT) framework to improve translation quality for language pairs with extremely limited data. Integrating manually designed or automatically extracted features into the NMT framework is known to be beneficial. However, this study emphasizes that the relevance of the features is crucial to performance. Specifically, we propose two methods, 1) self relevance and 2) word-based relevance, to improve the representation of features for NMT. Experiments are conducted on translation tasks from English to eight Asian languages, with no more than twenty thousand sentences for training. The proposed methods improve translation quality for all tasks by up to 3.09 BLEU points. Discussion and visualization help explain the proposed methods: we show that the relevance methods assign weights to features, thereby enhancing their impact on low-resource machine translation.
- Anthology ID:
- 2020.coling-main.376
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Donia Scott, Nuria Bel, Chengqing Zong
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 4263–4274
- URL:
- https://aclanthology.org/2020.coling-main.376
- DOI:
- 10.18653/v1/2020.coling-main.376
- Cite (ACL):
- Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, and Eiichiro Sumita. 2020. Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 4263–4274, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation (Chakrabarty et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2020.coling-main.376.pdf