Copyright Infringement by Large Language Models in the EU: Misalignment, Safeguards, and the Path Forward

Noah Scharrenberg, Chang Sun


Abstract
This position paper argues that European copyright law has struggled to keep pace with the development of large language models (LLMs), possibly creating a fundamental epistemic misalignment: copyright compliance relies on qualitative, context-dependent standards, while LLM development is governed by quantitative, proactive metrics. This gap means that technical safeguards, by themselves, may be insufficient to reliably demonstrate legal compliance. We identify several practical limitations in existing EU legal frameworks, including ambiguous “lawful access” rules, fragmented opt-outs, and vague disclosure duties. We then discuss technical measures such as provenance-first data governance, machine unlearning for post-hoc removal, and synthetic data generation, showing their promise but also their limits. Finally, we propose a path forward grounded in legal-technical co-design, suggesting directions for standardising machine-readable opt-outs and disclosure templates, clarifying core legal terms, and developing legally-informed benchmarks and evidence standards. We conclude that such an integrated framework is essential to make compliance auditable, thereby protecting creators’ rights while enabling responsible AI innovation at scale.
Anthology ID:
2025.nllp-1.9
Volume:
Proceedings of the Natural Legal Language Processing Workshop 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Nikolaos Aletras, Ilias Chalkidis, Leslie Barrett, Cătălina Goanță, Daniel Preoțiuc-Pietro, Gerasimos Spanakis
Venues:
NLLP | WS
Publisher:
Association for Computational Linguistics
Pages:
125–134
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.nllp-1.9/
Cite (ACL):
Noah Scharrenberg and Chang Sun. 2025. Copyright Infringement by Large Language Models in the EU: Misalignment, Safeguards, and the Path Forward. In Proceedings of the Natural Legal Language Processing Workshop 2025, pages 125–134, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Copyright Infringement by Large Language Models in the EU: Misalignment, Safeguards, and the Path Forward (Scharrenberg & Sun, NLLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.nllp-1.9.pdf