Rupesh Dhakad


2025

pdf bib
Team-SVNIT at JUST-NLP 2025: Domain-Adaptive Fine-Tuning of Multilingual Models for English–Hindi Legal Machine Translation
Rupesh Dhakad | Naveen Kumar | Shrikant Malviya
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)

Translating the sentences between English and Hindi is challenging, especially in the domain of legal documents. The major reason behind the complexity is specialized legal terminology, long and complex sentences, and the accuracy constraint. This paper presents a system developed by Team-SVNIT for the JUST-NLP 2025 shared task on legal machine translation. We fine-tune and compare multiple pretrained multilingual translation models, including the facebook/nllb-200-distilled-1.3B, on a corpus of 50,000 English–Hindi legal sentence pairs provided for the shared task. The training pipeline includes preprocessing, context windows of 512 tokens, and decoding methods to enhance translation quality. The proposed method secured 1st place on the official leaderboard with the AutoRank score of 61.62. We obtained the following scores on various metrics: BLEU 51.61, METEOR 75.80, TER 37.09, CHRF++ 73.29, BERTScore 92.61, and COMET 76.36. These results demonstrate that fine-tuning multilingual models for a domain-specific machine translation task enhances performance. It works better than general multilingual translation systems.