Abstract
German legal language has been largely neglected in NLP research, especially in summarization, where most systems are built on English newspaper articles. In this paper, we propose the task of automatic summarization of German court rulings. Because rulings are long and complex, legal practitioners need to identify the content of a verdict quickly in order to decide on its relevance for a given legal case. To tackle this problem, we introduce a new dataset consisting of 100k German judgments with short summaries. Our dataset has the highest compression ratio among the most common summarization datasets. German court rulings contain much structural information, so we create a pre-processing pipeline tailored explicitly to the German legal domain. Additionally, we implement multiple extractive as well as abstractive summarization systems and build a wide variety of baseline models. Our best model achieves a ROUGE-1 score of 30.50. With this work, we lay the groundwork for further research on German summarization systems.
- Anthology ID:
- 2021.nllp-1.19
- Volume:
- Proceedings of the Natural Legal Language Processing Workshop 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Nikolaos Aletras, Ion Androutsopoulos, Leslie Barrett, Catalina Goanta, Daniel Preotiuc-Pietro
- Venue:
- NLLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 180–189
- URL:
- https://aclanthology.org/2021.nllp-1.19
- DOI:
- 10.18653/v1/2021.nllp-1.19
- Cite (ACL):
- Ingo Glaser, Sebastian Moser, and Florian Matthes. 2021. Summarization of German Court Rulings. In Proceedings of the Natural Legal Language Processing Workshop 2021, pages 180–189, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Summarization of German Court Rulings (Glaser et al., NLLP 2021)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2021.nllp-1.19.pdf
- Code:
- sebimo/legalsum
- Data:
- BigPatent, CNN/Daily Mail