CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset
Wanhong Huang, Yi Feng, Chuanyi Li, Honghan Wu, Jidong Ge, Vincent Ng
Abstract
Legal Judgment Prediction (LJP) has attracted significant attention in recent years. However, previous studies have primarily focused on cases involving a single defendant, skipping multi-defendant cases because of their complexity and difficulty. To advance research, we introduce CMDL, a large-scale real-world Chinese Multi-Defendant LJP dataset, which consists of over 393,945 cases with nearly 1.2 million defendants in total. For performance evaluation, we propose case-level evaluation metrics dedicated to the multi-defendant scenario. Experimental results on CMDL show that existing SOTA approaches exhibit weaknesses when applied to cases involving multiple defendants. We highlight several challenges that require attention and resolution.
- Anthology ID:
- 2024.findings-acl.351
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5895–5906
- URL:
- https://aclanthology.org/2024.findings-acl.351
- DOI:
- 10.18653/v1/2024.findings-acl.351
- Cite (ACL):
- Wanhong Huang, Yi Feng, Chuanyi Li, Honghan Wu, Jidong Ge, and Vincent Ng. 2024. CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset. In Findings of the Association for Computational Linguistics: ACL 2024, pages 5895–5906, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset (Huang et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.findings-acl.351.pdf