Robert Mahari

2024

Making legal knowledge accessible to non-experts is crucial for enhancing general legal literacy and encouraging civic participation in democracy. However, legal documents are often challenging to understand for people without legal backgrounds. In this paper, we present a novel application of large language models (LLMs) in legal education to help non-experts learn intricate legal concepts through storytelling, an effective pedagogical tool in conveying complex and abstract concepts. We also introduce a new dataset LegalStories, which consists of 294 complex legal doctrines, each accompanied by a story and a set of multiple-choice questions generated by LLMs. To construct the dataset, we experiment with various LLMs to generate legal stories explaining these concepts. Furthermore, we use an expert-in-the-loop approach to iteratively design multiple-choice questions. Then, we evaluate the effectiveness of storytelling with LLMs through randomized controlled trials (RCTs) with legal novices on 10 samples from the dataset. We find that LLM-generated stories enhance comprehension of legal concepts and interest in law among non-native speakers compared to only definitions. Moreover, stories consistently help participants relate legal concepts to their lives. Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment. Our work has strong implications for using LLMs in promoting teaching and learning in the legal field and beyond.

pdf abs
LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent
Robert Mahari | Dominik Stammbach | Elliott Ash | Alex Pentland
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present the Legal Passage Retrieval Dataset, LePaRD. LePaRD contains millions of examples of U.S. federal judges citing precedent in context. The dataset aims to facilitate work on legal passage retrieval, a challenging practice-oriented legal retrieval and reasoning task. Legal passage retrieval seeks to predict relevant passages from precedential court decisions given the context of a legal argument. We extensively evaluate various approaches on LePaRD, and find that classification-based retrieval appears to work best. Our best models only achieve a recall of 59% when trained on data corresponding to the 10,000 most-cited passages, underscoring the difficulty of legal passage retrieval. By publishing LePaRD, we provide a large-scale and high quality resource to foster further research on legal passage retrieval. We hope that research on this practice-oriented NLP task will help expand access to justice by reducing the burden associated with legal research via computational assistance. Warning: Extracts from judicial opinions may contain offensive language.

2023

pdf abs
The Law and NLP: Bridging Disciplinary Disconnects
Robert Mahari | Dominik Stammbach | Elliott Ash | Alex Pentland
Findings of the Association for Computational Linguistics: EMNLP 2023

Legal practice is intrinsically rooted in the fabric of language, yet legal practitioners and scholars have been slow to adopt tools from natural language processing (NLP). At the same time, the legal system is experiencing an access to justice crisis, which could be partially alleviated with NLP. In this position paper, we argue that the slow uptake of NLP in legal practice is exacerbated by a disconnect between the needs of the legal community and the focus of NLP researchers. In a review of recent trends in the legal NLP literature, we find limited overlap between the legal NLP community and legal academia. Our interpretation is that some of the most popular legal NLP tasks fail to address the needs of legal practitioners. We discuss examples of legal NLP tasks that promise to bridge disciplinary disconnects and highlight interesting areas for legal NLP research that remain underexplored.

Co-authors

Eric Ma 1

Deb Roy 1

Venues

acl2
findings1