Abstract
This paper presents the approach and results of team v036 in the English-to-Low-Resource Multi-Modal Translation Task at the Ninth Conference on Machine Translation (WMT24). Our team tackled the challenge of translating English source text into low-resource Indic languages, specifically Hindi, Malayalam, and Bengali, while leveraging the visual context provided alongside the text data. We used InternVL2 to extract the image context, along with knowledge distillation from larger LLMs, to train a small language model on the translation task. During the current shared task phase, we submitted our best models for this task, and overall we ranked third on the Hindi, Bengali, and Malayalam datasets. We also open-source our models on Hugging Face.
- Anthology ID:
- 2024.wmt-1.79
- Volume:
- Proceedings of the Ninth Conference on Machine Translation
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
- Venue:
- WMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 833–838
- URL:
- https://aclanthology.org/2024.wmt-1.79
- DOI:
- 10.18653/v1/2024.wmt-1.79
- Cite (ACL):
- Pawan Rajpoot, Nagaraj Bhat, and Ashish Shrivastava. 2024. Multimodal Machine Translation for Low-Resource Indic Languages: A Chain-of-Thought Approach Using Large Language Models. In Proceedings of the Ninth Conference on Machine Translation, pages 833–838, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Multimodal Machine Translation for Low-Resource Indic Languages: A Chain-of-Thought Approach Using Large Language Models (Rajpoot et al., WMT 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.wmt-1.79.pdf