A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German
Shiva Banasaz Nouri, Elena Leitner, Julian Moreno-Schneider, Georg Rehm
Abstract
The legal domain is particularly challenging for natural language processing due to the personal and confidential information it contains. Despite the significant advances of large language models (LLMs), applying them to relation extraction (RE) in legal texts remains challenging, not only because of the task’s linguistic and semantic complexity, but also due to privacy, compliance, and infrastructure constraints under regulations such as the EU AI Act. To address these challenges, we propose a novel synthetic dataset for German legal relation extraction, created using LLMs through a controlled, privacy-preserving, template-based pipeline. The dataset allows for reproducible and legally compliant experimentation. We benchmark it using two few-shot learning paradigms, a description-enhanced Model-Agnostic Meta-Learning (MAML) framework and Prototypical Networks with supervised contrastive loss and curriculum-aware prototype enrichment. Our results demonstrate that combining few-shot learning with structured semantic knowledge achieves robust and interpretable results, with the curriculum-aware Proto-Contrastive model reaching an F1-score of 99.83%.
- Anthology ID:
- 2026.lrec-main.830
- Volume:
- Proceedings of the Fifteenth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2026
- Address:
- Palma de Mallorca, Spain
- Editors:
- Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
- Venue:
- LREC
- Publisher:
- ELRA Language Resource Association
- Pages:
- 10579–10591
- URL:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.830/
- Cite (ACL):
- Shiva Banasaz Nouri, Elena Leitner, Julian Moreno-Schneider, and Georg Rehm. 2026. A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German. International Conference on Language Resources and Evaluation, main:10579–10591.
- Cite (Informal):
- A Novel Synthetic Dataset for Few-Shot Legal Relation Extraction in German (Banasaz Nouri et al., LREC 2026)
- PDF:
- https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.830.pdf