2025
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Ashutosh Modi | Saptarshi Ghosh | Asif Ekbal | Pawan Goyal | Sarika Jain | Abhinav Joshi | Shivani Mishra | Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Overview of the 1st Workshop on NLP for Empowering Justice
Ashutosh Modi | Saptarshi Ghosh | Asif Ekbal | Pawan Goyal | Sarika Jain | Abhinav Joshi | Shivani Mishra | Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
The first edition of JUST-NLP, the Workshop on NLP for Empowering Justice, was organized to accelerate research in Natural Language Processing for legal text processing. JUST-NLP 2025 was held as a hybrid event at IJCNLP-AACL 2025 on December 24 at IIT Bombay. The program featured a research track, four invited talks, and two shared tasks: (1) L-SUMM, an abstractive summarization task for Indian legal judgments, and (2) L-MT, a legal machine translation task between English and Hindi. The workshop received strong interest from the community, attracting 29 submissions, of which 21 were accepted. Among the accepted papers, 5 were regular research-track papers published in the proceedings and 2 were accepted as non-archival presentations; for the shared tasks, 9 papers were accepted for L-SUMM and 5 for L-MT, all published in the proceedings. The workshop covered a broad set of Legal NLP challenges, including information extraction, retrieval, multilingual processing, legal reasoning, and applications of large language models. Overall, JUST-NLP 2025 aimed to bring together AI researchers and legal practitioners to develop scalable, domain-aware NLP methods that can support legal workflows and contribute toward more efficient and equitable justice systems.
Findings of the JUST-NLP 2025 Shared Task on Summarization of Indian Court Judgments
Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar | Abhinav Joshi | Shivani Mishra | Sarika Jain | Asif Ekbal | Pawan Goyal | Ashutosh Modi | Saptarshi Ghosh
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
This paper presents an overview of the Shared Task on Summarization of Indian Court Judgments (L-SUMM), hosted by the JUST-NLP 2025 Workshop at IJCNLP-AACL 2025. The task aims to increase research interest in automatic summarization techniques for lengthy and intricate legal documents from the Indian judiciary. It particularly addresses court judgments that contain dense legal reasoning and semantic roles that must be preserved in summaries. As part of this shared task, we introduce the Indian Legal Summarization (L-SUMM) dataset, comprising 1,800 Indian court judgments paired with expert-written abstractive summaries, both in English; the task accordingly focuses on generating high-quality abstractive summaries of court judgments in English. A total of 9 teams participated, exploring a diverse range of methodologies, including transformer-based models, extractive-abstractive hybrids, graph-based ranking approaches, long-context LLMs, and rhetorical-role-based techniques. This paper describes the task setup, dataset, evaluation framework, and our findings. We report the results and highlight key trends across participant approaches, including the effectiveness of hybrid pipelines and challenges in handling extreme sequence lengths.
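To illustrate the extractive-abstractive hybrid pattern several teams explored (this is a generic sketch, not any team's actual system), the snippet below first ranks sentences by TF-IDF salience to fit the abstractive model's context window, then rewrites the selection with an off-the-shelf summarizer; the model choice and hyperparameters are illustrative assumptions.

```python
# Sketch of an extractive-then-abstractive summarization pipeline.
# Assumptions: TF-IDF salience ranking and facebook/bart-large-cnn as the
# abstractive model; shared-task participants used their own configurations.
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import pipeline

def extract_salient(sentences, top_k=20):
    """Rank sentences by mean TF-IDF weight; keep top_k in document order."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    scores = tfidf.mean(axis=1).A.ravel()
    keep = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:top_k])
    return [sentences[i] for i in keep]

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_judgment(sentences):
    condensed = " ".join(extract_salient(sentences))
    # BART's encoder is capped at 1024 tokens; truncation guards against overflow.
    return summarizer(condensed, max_length=256, min_length=64,
                      truncation=True)[0]["summary_text"]
```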
Findings of the JUST-NLP 2025 Shared Task on English-to-Hindi Legal Machine Translation
Kshetrimayum Boynao Singh | Sandeep Kumar | Debtanu Datta | Abhinav Joshi | Shivani Mishra | Shounak Paul | Pawan Goyal | Sarika Jain | Saptarshi Ghosh | Ashutosh Modi | Asif Ekbal
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
This paper provides an overview of the Shared Task on Legal Machine Translation (L-MT), organized as part of the JUST-NLP 2025 Workshop at IJCNLP-AACL 2025 and aimed at improving the translation of legal texts, a domain where precision, structural faithfulness, and terminology preservation are essential. The training set comprises 50,000 sentences, with 5,000 sentences each for the validation and test sets. The submissions employed strategies such as domain-adaptive fine-tuning of multilingual models, QLoRA-based parameter-efficient adaptation, curriculum-guided supervised training, reinforcement learning with verifiable MT metrics, and from-scratch Transformer training. Systems are evaluated using the BLEU, METEOR, TER, chrF++, BERTScore, and COMET metrics, and we also combine these metric scores into a single average score (AutoRank). The top-performing system is based on a fine-tuned distilled NLLB-200 model and achieved the highest AutoRank score of 72.1. Domain adaptation consistently yielded substantial improvements over baseline models, and precision-focused rewards proved especially effective for legal MT. The findings also highlight that large multilingual Transformers can deliver accurate and reliable English-to-Hindi legal translations when carefully fine-tuned on legal data, advancing the broader goal of improving access to justice in multilingual settings.
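As an illustration of this multi-metric evaluation, the sketch below computes three of the six metrics with sacrebleu and combines them into a single average in the spirit of AutoRank; the official AutoRank normalization and weighting are not reproduced here, so the aggregation shown is an assumption.

```python
# Sketch of multi-metric MT evaluation with an AutoRank-style average.
# Assumption: a simple mean after flipping TER's polarity (lower TER is better);
# the shared task's exact aggregation may differ.
from sacrebleu.metrics import BLEU, CHRF, TER

def evaluate(hypotheses, references):
    refs = [references]  # sacrebleu expects one list per reference set
    bleu = BLEU().corpus_score(hypotheses, refs).score                # 0-100
    chrfpp = CHRF(word_order=2).corpus_score(hypotheses, refs).score  # chrF++
    ter = TER().corpus_score(hypotheses, refs).score                  # lower is better
    autorank_like = (bleu + chrfpp + max(0.0, 100.0 - ter)) / 3
    return {"BLEU": bleu, "chrF++": chrfpp, "TER": ter,
            "AutoRank-like": autorank_like}

hyps = ["the court dismissed the appeal"]
refs = ["the court dismissed the appeal with costs"]
print(evaluate(hyps, refs))
```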
Evaluating IndicTrans2 and ByT5 for English–Santali Machine Translation Using the Ol Chiki Script
Kshetrimayum Boynao Singh | Asif Ekbal | Partha Pakray
Proceedings of the 1st Workshop on Multimodal Models for Low-Resource Contexts and Social Impact (MMLoSo 2025)
In this study, we examine and evaluate two multilingual NMT models, IndicTrans2 and ByT5, for English-Santali bidirectional translation using the Ol Chiki script. The models are trained on the MMLoSo Shared Task dataset, supplemented with public English-Santali resources, and evaluated on the AI4Bharat IN22 and Flores test sets, specifically IN22-Gen and Flores200-dev. The fine-tuned IndicTrans2 strongly outperforms ByT5 in both directions. On IN22-Gen, it achieves 26.8 BLEU and 53.9 chrF++ for Santali→English and 7.3 BLEU and 40.3 chrF++ for English→Santali, compared to ByT5's 5.6 BLEU and 30.2 chrF++ for Santali→English and 2.9 BLEU and 32.6 chrF++ for English→Santali. On the Flores test set, the fine-tuned IndicTrans2 achieves 22 BLEU and 49.2 chrF++ for Santali→English and 4.7 BLEU and 32.7 chrF++ for English→Santali, again surpassing ByT5. While ByT5's byte-level modelling is script-agnostic, it struggles with Santali morphology; IndicTrans2 benefits from multilingual pre-training and script unification.
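For readers unfamiliar with byte-level modelling, the sketch below shows a single fine-tuning step of ByT5 on one sentence pair; because the tokenizer operates on raw bytes, Ol Chiki text needs no dedicated vocabulary. The hyperparameters are illustrative and the sentence pair is a placeholder, not a real translation.

```python
# Sketch of one byte-level seq2seq training step with ByT5.
# Assumptions: google/byt5-small and a plain AdamW step; the paper's exact
# training setup is not reproduced here.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("google/byt5-small")  # byte-level: no OOV for Ol Chiki
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")
optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

src = ["Hello world"]          # placeholder source sentence
tgt = ["ᱥᱟᱱᱛᱟᱲᱤ ᱞᱮᱠᱷᱟ"]          # placeholder Ol Chiki target, not a real translation
batch = tok(src, text_target=tgt, return_tensors="pt", padding=True)
loss = model(**batch).loss     # labels come from text_target
loss.backward(); optim.step(); optim.zero_grad()
```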
Does Vision Still Help? Multimodal Translation with CLIP-Based Image Selection
Deepak Kumar | Baban Gain | Kshetrimayum Boynao Singh | Asif Ekbal
Proceedings of the Twelfth Workshop on Asian Translation (WAT 2025)
Multimodal Machine Translation aims to enhance conventional text-only translation systems by incorporating visual context, typically in the form of images paired with captions. In this work, we present our submission to the WAT 2025 Multimodal Translation Shared Task, which explores the role of visual information in translating English captions into four Indic languages: Hindi, Bengali, Malayalam, and Odia. Our system builds upon the strong multilingual text translation backbone IndicTrans, augmented with a CLIP-based selective visual grounding mechanism. Specifically, we compute cosine similarities between text and image embeddings (both full and cropped regions) and automatically select the most semantically aligned image representation to integrate into the translation model. We observe that the overall contribution of visual features is questionable. Our findings reaffirm recent evidence that large multilingual translation models can perform competitively without explicit visual grounding.
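A minimal sketch of the selection step follows, assuming CLIP ViT-B/32 as the encoder and pre-computed candidate crops (both assumptions; the system's exact configuration may differ).

```python
# Sketch of CLIP-based selective image grounding: cosine similarity between
# the caption embedding and each candidate image embedding (full image plus
# crops) picks the visual representation to pass to the translation model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def select_image(caption: str, candidates: list) -> int:
    """Return the index of the candidate image most aligned with the caption."""
    inputs = proc(text=[caption], images=candidates,
                  return_tensors="pt", padding=True)
    with torch.no_grad():
        txt = model.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])
        img = model.get_image_features(pixel_values=inputs["pixel_values"])
    sims = torch.nn.functional.cosine_similarity(txt, img)  # one score per image
    return int(sims.argmax())
```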
Instruction-Tuned English to Bhojpuri Neural Machine Translation Using Contrastive Preference Optimization
Kshetrimayum Boynao Singh | Deepak Kumar | Asif Ekbal
Proceedings of the Tenth Conference on Machine Translation
This paper presents an English to Bhojpuri machine translation (MT) system developed for the WMT25 General MT Shared Task. Given the low-resource nature of Bhojpuri, we adopt a two-stage training pipeline: unsupervised pretraining followed by supervised fine-tuning. During pretraining, we use a 300,000-sentence corpus comprising 70% Bhojpuri monolingual data and 30% English data to establish language grounding. The fine-tuning stage utilizes 29,749 bilingual English to Bhojpuri sentence pairs (including training, validation, and test sets). To adapt the system to instruction-following scenarios, we apply a novel optimization strategy: Contrastive Preference Optimization (CPO). This technique enables the model to capture fine-grained translation preferences and maintain semantic fidelity in instruction-tuned settings. We evaluate our system across multiple metrics, demonstrating moderate performance in low-resource MT tasks, particularly in diverse domains such as literary, news, social, and speech.
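To show the general shape of CPO training, the sketch below uses the TRL library's CPOTrainer; this is an assumption about tooling rather than the paper's actual implementation, and the base model, hyperparameters, and preference pair are all placeholders.

```python
# Sketch of Contrastive Preference Optimization with TRL's CPOTrainer.
# CPO contrasts a preferred translation ("chosen") against a weaker one
# ("rejected") for the same prompt, without a separate reference model.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

model_name = "meta-llama/Llama-3.2-1B-Instruct"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

pairs = Dataset.from_list([{
    "prompt": "Translate to Bhojpuri: The court adjourned the hearing.",
    "chosen": "<placeholder preferred Bhojpuri translation>",
    "rejected": "<placeholder weaker Bhojpuri translation>",
}])

args = CPOConfig(output_dir="cpo-en-bho", beta=0.1,
                 per_device_train_batch_size=1)
# Recent TRL versions take `processing_class`; older ones use `tokenizer=`.
trainer = CPOTrainer(model=model, args=args, train_dataset=pairs,
                     processing_class=tokenizer)
trainer.train()
```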
Evaluation of LLM for English to Hindi Legal Domain Machine Translation Systems
Kshetrimayum Boynao Singh | Deepak Kumar | Asif Ekbal
Proceedings of the Tenth Conference on Machine Translation
The study critically examines various Machine Translation systems, particularly Large Language Models, using the WMT25 Legal Domain Test Suite for translating English into Hindi. It utilizes a dataset of 5,000 sentences designed to capture the complexity of legal texts, based on word-frequency ranges from 5 to 54; each frequency range contains 100 sentences, and the ranges collectively form a corpus that spans from simple legal terms to intricate legal provisions. Six metrics were used to evaluate system performance: BLEU, METEOR, TER, chrF++, BERTScore, and COMET. The findings reveal diverse capabilities and limitations of LLM architectures in handling complex legal texts. Notably, Gemini-2.5-Pro, Claude-4, and ONLINE-B topped the performance charts in terms of human evaluation, showcasing the potential of LLMs for nuanced translation. Despite these advances, the study identified areas for further research, especially in improving robustness, reliability, and explainability for use in critical legal contexts. The study also supports the WMT25 subtask focused on evaluating weaknesses of large language models (LLMs). The dataset and related resources are publicly available at https://github.com/helloboyn/WMT25-TS.
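Of the six metrics, COMET is the only learned one; a minimal scoring sketch with the unbabel-comet package follows. The checkpoint choice is an assumption, since the paper does not name the COMET model used, and the translations shown are placeholders.

```python
# Sketch of corpus-level COMET scoring with the unbabel-comet package.
# Assumption: the Unbabel/wmt22-comet-da checkpoint.
from comet import download_model, load_from_checkpoint

ckpt = download_model("Unbabel/wmt22-comet-da")
comet = load_from_checkpoint(ckpt)

data = [{
    "src": "The appeal is dismissed with costs.",
    "mt":  "<placeholder system Hindi translation>",
    "ref": "<placeholder reference Hindi translation>",
}]
out = comet.predict(data, batch_size=8, gpus=0)
print(out.system_score)  # corpus-level COMET score
```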
Tackling Low-Resource NMT with Instruction-Tuned LLaMA: A Study on Kokborok and Bodo
Deepak Kumar | Kshetrimayum Boynao Singh | Asif Ekbal
Proceedings of the Tenth Conference on Machine Translation
This paper presents a new neural machine translation (NMT) system aimed at low-resource language pairs: English to Kokborok and English to Bodo. The framework leverages the LLaMA3-8B-Instruct model along with LoRA-based parameter-efficient fine-tuning. For translating into Kokborok, the model undergoes an initial continued pre-training phase on a dataset containing 75,000 Kokborok and 25,000 English monolingual sentences, followed by instruction-tuning. This tuning uses a reformulated version of the WMT25 dataset, adapted to the Alpaca format to support instructional goals. For Bodo, the model is pre-trained on a more extensive dataset of 350,000 Bodo and 125,000 English sentences, with a similar instruction-tuning approach. LoRA adapters are used to adapt the large LLaMA3 model to these low-resource settings. Testing on the WMT25 test set reveals modest translation results, highlighting the difficulties of translating low-resource languages. For English to Bodo, the model achieved a BLEU score of 4.38, a TER of 92.5, and a chrF score of 35.4; for English to Kokborok, it yielded a BLEU score of 0.17, a TER of 105.4, and a chrF score of 5.59. These results underscore the intricacies of the task and highlight the critical need for further data collection, domain-specific adaptations, and improvements in model design to better support underrepresented languages.
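A minimal sketch of the LoRA setup with the PEFT library follows; the rank, scaling, and target modules are illustrative assumptions rather than the paper's reported configuration.

```python
# Sketch of LoRA-based parameter-efficient adaptation of LLaMA3-8B-Instruct:
# the frozen base model is wrapped so that only low-rank adapter weights train.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,       # illustrative values
                  target_modules=["q_proj", "v_proj"],          # attention projections
                  task_type="CAUSAL_LM")
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```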
2024
WMT24 System Description for the MultiIndic22MT Shared Task on Manipuri Language
Ningthoujam Justwant Singh | Kshetrimayum Boynao Singh | Ningthoujam Avichandra Singh | Sanjita Phijam | Thoudam Doren Singh
Proceedings of the Ninth Conference on Machine Translation
This paper presents a Transformer-based Neural Machine Translation (NMT) system developed by the Centre for Natural Language Processing and the Department of Computer Science and Engineering at the National Institute of Technology Silchar, India (NITS-CNLP) for the MultiIndic22MT 2024 Shared Task. The system focused on the English-Manipuri language pair for the WMT24 shared task. The proposed system achieves a BLEU score of 6.4, a chrF score of 28.6, and a chrF++ score of 26.6 on the public Indic-Conv test set. On the public Indic-Gen test set, it achieves a BLEU score of 8.1, a chrF score of 32.1, and a chrF++ score of 29.4 for English-to-Manipuri translation.
2023
A comparative study of transformer and transfer learning MT models for English-Manipuri
Kshetrimayum Boynao Singh | Ningthoujam Avichandra Singh | Loitongbam Sanayai Meetei | Ningthoujam Justwant Singh | Thoudam Doren Singh | Sivaji Bandyopadhyay
Proceedings of the 20th International Conference on Natural Language Processing (ICON)
In this work, we focus on the development of machine translation (MT) models for a low-resource language pair, viz. English-Manipuri. Manipuri is listed in the Eighth Schedule of the Indian constitution. Manipuri is currently written in two different scripts: its original script, called Meitei Mayek, and the Bengali script. We evaluate the performance of English-Manipuri MT models based on the transformer and transfer learning techniques. Our MT models are trained on a dataset of 69,065 parallel sentences and validated on 500 sentences. On 500 test sentences, the English to Manipuri MT models achieved BLEU scores of 19.13 and 29.05 with mT5 and OpenNMT respectively. The results demonstrate that the OpenNMT model significantly outperforms the mT5 model. Additionally, the Manipuri to English MT system trained with OpenNMT reported a BLEU score of 30.90. We also carried out a comparative analysis between the Bengali script and the transliterated Meitei Mayek script for English-Manipuri MT models. This analysis reveals that the transliterated version enhances MT model performance, resulting in a notable +2.35 improvement in the BLEU score.
NITS-CNLP Low-Resource Neural Machine Translation Systems of English-Manipuri Language Pair
Kshetrimayum Boynao Singh | Ningthoujam Avichandra Singh | Loitongbam Sanayai Meetei | Sivaji Bandyopadhyay | Thoudam Doren Singh
Proceedings of the Eighth Conference on Machine Translation
This paper describes the transformer-based Neural Machine Translation (NMT) system submitted by the Centre for Natural Language Processing at the National Institute of Technology Silchar, India (NITS-CNLP) to the Low-Resource Indic Language Translation task for the English-Manipuri language pair in the WMT 2023 shared task. The model attained overall BLEU scores of 22.75 and 26.92 for the English to Manipuri and Manipuri to English translations respectively. For the same two directions, we report character-level n-gram F-scores (chrF) of 48.35 and 48.64, RIBES of 0.61 and 0.65, TER of 70.02 and 67.62, and COMET of 0.70 and 0.66 respectively.