LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent
Robert Mahari, Dominik Stammbach, Elliott Ash, Alex Pentland
Abstract
We present the Legal Passage Retrieval Dataset, LePaRD. LePaRD contains millions of examples of U.S. federal judges citing precedent in context. The dataset aims to facilitate work on legal passage retrieval, a challenging practice-oriented legal retrieval and reasoning task. Legal passage retrieval seeks to predict relevant passages from precedential court decisions given the context of a legal argument. We extensively evaluate various approaches on LePaRD, and find that classification-based retrieval appears to work best. Our best models only achieve a recall of 59% when trained on data corresponding to the 10,000 most-cited passages, underscoring the difficulty of legal passage retrieval. By publishing LePaRD, we provide a large-scale and high quality resource to foster further research on legal passage retrieval. We hope that research on this practice-oriented NLP task will help expand access to justice by reducing the burden associated with legal research via computational assistance. Warning: Extracts from judicial opinions may contain offensive language.
- Anthology ID:
- 2024.acl-long.532
- Volume:
- Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9863–9877
- URL:
- https://aclanthology.org/2024.acl-long.532
- DOI:
- 10.18653/v1/2024.acl-long.532
- Cite (ACL):
- Robert Mahari, Dominik Stammbach, Elliott Ash, and Alex Pentland. 2024. LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9863–9877, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent (Mahari et al., ACL 2024)
- PDF:
- https://preview.aclanthology.org/ingest-2024-clasp/2024.acl-long.532.pdf