@inproceedings{singh-etal-2025-evaluation,
title = "Evaluation of {LLM} for {E}nglish to {H}indi Legal Domain Machine Translation Systems",
author = "Singh, Kshetrimayum Boynao and
Kumar, Deepak and
Ekbal, Asif",
editor = "Haddow, Barry and
Kocmi, Tom and
Koehn, Philipp and
Monz, Christof",
booktitle = "Proceedings of the Tenth Conference on Machine Translation",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.wmt-1.57/",
doi = "10.18653/v1/2025.wmt-1.57",
pages = "823--833",
ISBN = "979-8-89176-341-8",
abstract = "The study critically examines various Machine Translation systems, particularly focusing on Large Language Models, using the WMT25 Legal Domain Test Suite for translating English into Hindi. It utilizes a dataset of 5,000 sentences designed to capture the complexity of legal texts, based on word frequency ranges from 5 to 54. Each frequency range contains 100 sentences, collectively forming a corpus that spans from simple legal terms to intricate legal provisions. Six metrics were used to evaluate the performance of the system: BLEU, METEOR, TER, CHRF++, BERTScore and COMET. The findings reveal diverse capabilities and limitations of LLM architectures in handling complex legal texts. Notably, Gemini-2.5-Pro, Claude-4 and ONLINE-B topped the performance charts in terms fo human evaluation, showcasing the potential of LLMs for nuanced trans- lation. Despite these advances, the study identified areas for further research, especially in improving robustness, reliability, and explainability for use in critical legal contexts. The study also supports the WMT25 subtask focused on evaluating weaknesses of large language models (LLMs). The dataset and related resources are publicly available at https://github.com/helloboyn/WMT25-TS."
}