Antoine Scardigli
2023
MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding
Steven Wang | Antoine Scardigli | Leonard Tang | Wei Chen | Dmitry Levkin | Anya Chen | Spencer Ball | Thomas Woodside | Oliver Zhang | Dan Hendrycks
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association’s 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.