From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation

Amit Barman, Atanu Mandal, Sudip Kumar Naskar


Abstract
In multilingual nations like India, access to legal information is often hindered by language barriers, as much of the legal and judicial documentation remains in English. Legal Machine Translation (L-MT) offers a scalable solution to this challenge by enabling accurate and accessible translations of legal documents. This paper presents our work for the JUST-NLP 2025 Legal MT shared task, focusing on English–Hindi translation using Transformer-based approaches. We experiment with two complementary strategies: fine-tuning a pre-trained OPUS-MT model for domain-specific adaptation, and training a Transformer model from scratch on the provided legal corpus. Performance is evaluated using standard MT metrics, including SacreBLEU, chrF++, TER, ROUGE, BERTScore, METEOR, and COMET. Our fine-tuned OPUS-MT model achieves a SacreBLEU score of 46.03, significantly outperforming both the baseline and the from-scratch model. The results highlight the effectiveness of domain adaptation in enhancing translation quality and demonstrate the potential of L-MT systems to improve access to justice and legal transparency in multilingual contexts.
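As a rough illustration of the fine-tuning strategy the abstract describes (a minimal sketch, not the authors' actual training script), the Python code below adapts the pre-trained Helsinki-NLP/opus-mt-en-hi checkpoint to a legal parallel corpus using the Hugging Face transformers and datasets libraries. The corpus file legal_train.json, its en/hi fields, and all hyperparameters are illustrative assumptions; the paper's real configuration may differ.

# Minimal fine-tuning sketch. Assumes transformers and datasets are installed;
# the corpus path, field names, and hyperparameters are hypothetical.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "Helsinki-NLP/opus-mt-en-hi"          # pre-trained OPUS-MT En->Hi
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical JSON-lines corpus: {"en": <source>, "hi": <reference>} per line.
raw = load_dataset("json", data_files={"train": "legal_train.json"})

def preprocess(batch):
    # text_target tokenizes the Hindi references as decoder labels.
    return tokenizer(batch["en"], text_target=batch["hi"],
                     truncation=True, max_length=256)

tokenized = raw.map(preprocess, batched=True,
                    remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-en-hi-legal",
    learning_rate=2e-5,                # illustrative values, not the paper's
    per_device_train_batch_size=16,
    num_train_epochs=3,
    predict_with_generate=True,
)

Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
).train()

Scoring follows the same hedged pattern: given system hypotheses and gold references as lists of strings, the sacrebleu package computes two of the abstract's metrics, SacreBLEU and chrF++ (chrF with word_order=2 is chrF++).

import sacrebleu

hypotheses = ["..."]                   # system outputs, one string per segment
references = ["..."]                   # gold Hindi translations

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)
print(f"SacreBLEU {bleu.score:.2f}  chrF++ {chrf.score:.2f}")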
Anthology ID:
2025.justnlp-main.20
Volume:
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Ashutosh Modi, Saptarshi Ghosh, Asif Ekbal, Pawan Goyal, Sarika Jain, Abhinav Joshi, Shivani Mishra, Debtanu Datta, Shounak Paul, Kshetrimayum Boynao Singh, Sandeep Kumar
Venues:
JUSTNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
179–185
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.justnlp-main.20/
Cite (ACL):
Amit Barman, Atanu Mandal, and Sudip Kumar Naskar. 2025. From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation. In Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025), pages 179–185, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
From Scratch to Fine-Tuned: A Comparative Study of Transformer Training Strategies for Legal Machine Translation (Barman et al., JUSTNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.justnlp-main.20.pdf