Dhananjay Ghumare


2025

IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval
Shounak Paul | Dhananjay Ghumare | Pawan Goyal | Saptarshi Ghosh | Ashutosh Modi
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Identifying and retrieving relevant statutes and prior cases (precedents) for a given legal situation are common tasks for legal practitioners. To date, researchers have addressed the two tasks independently, developing entirely separate datasets and models for each; however, the two retrieval tasks are inherently related, e.g., similar cases tend to cite similar statutes (due to similar factual situations). In this paper, we address this gap. We propose IL-PCSR (Indian Legal corpus for Prior Case and Statute Retrieval), a unique corpus that provides a common testbed for developing models for both tasks (Statute Retrieval and Precedent Retrieval) that can exploit the dependence between the two. We experiment extensively with several baseline models on the tasks, including lexical models, semantic models, and ensembles based on GNNs. Further, to exploit the dependence between the two tasks, we develop an LLM-based re-ranking approach that gives the best performance.