@inproceedings{le-jeune-etal-2025-realharm,
    title = "{R}eal{H}arm: A Collection of Real-World Language Model Application Failures",
    author = "Le Jeune, Pierre  and
      Liu, Jiaen  and
      Rossi, Luca  and
      Dora, Matteo",
    editor = "Derczynski, Leon  and
      Novikova, Jekaterina  and
      Chen, Muhao",
    booktitle = "Proceedings of the The First Workshop on LLM Security (LLMSEC)",
    month = aug,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.llmsec-1.7/",
    pages = "87--100",
    ISBN = "979-8-89176-279-4",
    abstract = "Language model deployments in consumer-facing applications introduce numerous risks. While existing research on harms and hazards of such applications follows top-down approaches derived from regulatory frameworks and theoretical analyses, empirical evidence of real-world failure modes remains underexplored. In this work, we introduce RealHarm, a dataset of annotated problematic interactions with AI agents built from a systematic review of publicly reported incidents. Analyzing harms, causes, and hazards specifically from the deployer{'}s perspective, we find that reputational damage constitutes the predominant organizational harm, while misinformation emerges as the most common hazard category. We empirically evaluate state-of-the-art guardrails and content moderation systems to probe whether such systems would have prevented the incidents, revealing a significant gap in the protection of AI applications."
}