Abstract
This paper describes the work and the systems submitted by the IIIT-Hyderabad team to the WAT 2021 MultiIndicMT shared task. The task covers 10 major languages of the Indian subcontinent. For this task, we built multilingual systems for 20 translation directions, namely English-Indic (one-to-many) and Indic-English (many-to-one). Individually, Indian languages are resource-poor, which hampers translation quality, but by leveraging multilingualism and abundant monolingual corpora, translation quality can be substantially boosted. However, multilingual systems are highly demanding in terms of both time and computational resources. Therefore, we train our systems by efficiently selecting the data that will actually contribute most to the learning process. Furthermore, we exploit the language relatedness among Indian languages. All comparisons were made using BLEU scores, and we found that our final multilingual system significantly outperforms the baselines by an average of 11.3 and 19.6 BLEU points for the English-Indic (en-xx) and Indic-English (xx-en) directions, respectively.
- Anthology ID:
- 2021.wat-1.25
- Volume:
- Proceedings of the 8th Workshop on Asian Translation (WAT2021)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Venue:
- WAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 212–216
- URL:
- https://aclanthology.org/2021.wat-1.25
- DOI:
- 10.18653/v1/2021.wat-1.25
- Cite (ACL):
- Sourav Kumar, Salil Aggarwal, and Dipti Sharma. 2021. IIIT Hyderabad Submission To WAT 2021: Efficient Multilingual NMT systems for Indian languages. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), pages 212–216, Online. Association for Computational Linguistics.
- Cite (Informal):
- IIIT Hyderabad Submission To WAT 2021: Efficient Multilingual NMT systems for Indian languages (Kumar et al., WAT 2021)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2021.wat-1.25.pdf