2025
IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval
Shounak Paul | Dhananjay Ghumare | Pawan Goyal | Saptarshi Ghosh | Ashutosh Modi
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Retrieving relevant statutes and prior cases (precedents) for a given legal situation are common tasks performed by law practitioners. To date, researchers have addressed the two tasks independently, developing completely different datasets and models for each; however, the two retrieval tasks are inherently related, e.g., similar cases tend to cite similar statutes (due to similar factual situations). In this paper, we address this gap. We propose IL-PCSR (Indian Legal corpus for Prior Case and Statute Retrieval), a unique corpus that provides a common testbed for developing models for both tasks (Statute Retrieval and Precedent Retrieval) that can exploit the dependence between the two. We experiment extensively with several baseline models on the tasks, including lexical models, semantic models, and ensembles based on GNNs. Further, to exploit the dependence between the two tasks, we develop an LLM-based re-ranking approach that gives the best performance.
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Ashutosh Modi | Saptarshi Ghosh | Asif Ekbal | Pawan Goyal | Sarika Jain | Abhinav Joshi | Shivani Mishra | Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Overview of the 1st Workshop on NLP for Empowering Justice
Ashutosh Modi | Saptarshi Ghosh | Asif Ekbal | Pawan Goyal | Sarika Jain | Abhinav Joshi | Shivani Mishra | Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
JUST-NLP, the Workshop on NLP for Empowering Justice, was organized to accelerate research in Natural Language Processing for legal text processing. The inaugural edition, JUST-NLP 2025, was held as a hybrid event at IJCNLP-AACL 2025 on December 24 at IIT Bombay. The program featured a research track, four invited talks, and two shared tasks: (1) L-SUMM, an abstractive summarization task for Indian legal judgments, and (2) L-MT, a legal machine translation task between English and Hindi. The workshop received strong interest from the community, with 29 submissions, of which 21 were accepted. Among the accepted papers, 5 were regular research-track papers published in the proceedings, and 2 were accepted as non-archival presentations. For the shared tasks, 9 papers were accepted for L-SUMM and 5 papers were accepted for L-MT, all for publication in the proceedings. The workshop focused on a broad set of Legal NLP challenges, including information extraction, retrieval, multilingual processing, legal reasoning, and applications of large language models. Overall, JUST-NLP 2025 aimed to bring together AI researchers and legal practitioners to develop scalable, domain-aware NLP methods that can support legal workflows and contribute toward more efficient and equitable justice systems.
Findings of the JUST-NLP 2025 Shared Task on Summarization of Indian Court Judgments
Debtanu Datta | Shounak Paul | Kshetrimayum Boynao Singh | Sandeep Kumar | Abhinav Joshi | Shivani Mishra | Sarika Jain | Asif Ekbal | Pawan Goyal | Ashutosh Modi | Saptarshi Ghosh
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
This paper presents an overview of the Shared Task on Summarization of Indian Court Judgments (L-SUMM), hosted by the JUST-NLP 2025 Workshop at IJCNLP-AACL 2025. This task aims to increase research interest in automatic summarization techniques for lengthy and intricate legal documents from the Indian judiciary. It particularly addresses court judgments that contain dense legal reasoning and semantic roles that must be preserved in summaries. As part of this shared task, we introduce the Indian Legal Summarization (L-SUMM) dataset, comprising 1,800 Indian court judgments paired with expert-written abstractive summaries, both in English. Therefore, the task focuses on generating high-quality abstractive summaries of court judgments in English. A total of 9 teams participated in this task, exploring a diverse range of methodologies, including transformer-based models, extractive-abstractive hybrids, graph-based ranking approaches, long-context LLMs, and rhetorical-role-based techniques. This paper describes the task setup, dataset, evaluation framework, and our findings. We report the results and highlight key trends across participant approaches, including the effectiveness of hybrid pipelines and challenges in handling extreme sequence lengths.
Findings of the JUST-NLP 2025 Shared Task on English-to-Hindi Legal Machine Translation
Kshetrimayum Boynao Singh | Sandeep Kumar | Debtanu Datta | Abhinav Joshi | Shivani Mishra | Shounak Paul | Pawan Goyal | Sarika Jain | Saptarshi Ghosh | Ashutosh Modi | Asif Ekbal
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
This paper provides an overview of the Shared Task on Legal Machine Translation (L-MT), organized as part of the JUST-NLP 2025 Workshop at IJCNLP-AACL 2025, aimed at improving the translation of legal texts, a domain where precision, structural faithfulness, and terminology preservation are essential. The training set comprises 50,000 sentences, with 5,000 sentences each for the validation and test sets. The submissions employed strategies such as domain-adaptive fine-tuning of multilingual models, QLoRA-based parameter-efficient adaptation, curriculum-guided supervised training, reinforcement learning with verifiable MT metrics, and from-scratch Transformer training. The systems are evaluated using the BLEU, METEOR, TER, chrF++, BERTScore, and COMET metrics. We also combine the scores of these metrics into an average score (AutoRank). The top-performing system is based on a fine-tuned distilled NLLB-200 model and achieved the highest AutoRank score of 72.1. Domain adaptation consistently yielded substantial improvements over baseline models, and precision-focused rewards proved especially effective for legal MT. The findings also highlight that large multilingual Transformers can deliver accurate and reliable English-to-Hindi legal translations when carefully fine-tuned on legal data, advancing the broader goal of improving access to justice in multilingual settings.
2024
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
Abhinav Joshi | Shounak Paul | Akshat Sharma | Pawan Goyal | Saptarshi Ghosh | Ashutosh Modi
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Legal systems worldwide are inundated with exponential growth in cases and documents. There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents to streamline the legal system. However, evaluating and comparing various NLP models designed specifically for the legal domain is challenging. This paper addresses this challenge by proposing IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning. IL-TUR contains monolingual (English, Hindi) and multilingual (9 Indian languages) domain-specific tasks that address different aspects of the legal system from the point of view of understanding and reasoning over Indian legal documents. We present baseline models (including LLM-based) for each task, outlining the gap between models and the ground truth. To foster further research in the legal domain, we create a leaderboard (available at: https://exploration-lab.github.io/IL-TUR/) where the research community can upload and compare legal text understanding systems.
2020
Automatic Charge Identification from Facts: A Few Sentence-Level Charge Annotations is All You Need
Shounak Paul | Pawan Goyal | Saptarshi Ghosh
Proceedings of the 28th International Conference on Computational Linguistics
Automatic Charge Identification (ACI) is the task of identifying the relevant charges given the facts of a situation and the statutory laws that define these charges, and is a crucial aspect of the judicial process. Existing works focus on learning charge-side representations by modeling relationships between the charges, but not much effort has been made in improving fact-side representations. We observe that only a small fraction of sentences in the facts actually indicates the charges. We show that by using a very small subset (< 3%) of fact descriptions annotated with sentence-level charges, we can achieve an improvement across a range of different ACI models, as compared to modeling just the main document-level task on a much larger dataset. Additionally, we propose a novel model that utilizes sentence-level charge labels as an auxiliary task, coupled with the main task of document-level charge identification in a multi-task learning framework. The proposed model comprehensively outperforms a large number of recent baselines for ACI. The improvement in performance is particularly noticeable for the rare charges which are known to be especially challenging to identify.