Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech

Yuanchen Shi, Longyin Zhang, Guodong Zhou, Fang Kong


Abstract
Dangerous speech detection is a well-studied task, but existing approaches typically treat utterances in isolation, relying on binary labels that ignore who is speaking and in what mental state. We formulate a context-dependent variant of this task by grounding it in Theory-of-Mind (ToM). In cognitive science, ToM studies how humans attribute latent mental states (such as emotions, intentions, and actions) to others. We argue that such states are key signals for assessing the risk of an utterance. Building on this view, we construct ToM-DS, a 79K-instance dataset where each utterance is paired with structured speaker profiles, ToM states (emotion, intent, action), and topic hierarchies. During data construction, we first identify context-dependent sentences and generate diverse safe and dangerous scenarios surrounding them. High-quality annotations are obtained with state-of-the-art LLMs and a multi-stage cross-agent validation pipeline, yielding a comprehensive and reliable resource for context-dependent dangerous speech detection and fine-grained risk level classification. We further propose ToMGuard, a lightweight model with a dynamic ToM attention mechanism that adaptively weighs different mental-state cues. ToMGuard outperforms strong proprietary and open-source LLMs with significantly fewer parameters. Experimental results show that ToMGuard sets a new benchmark for context-dependent dangerous speech detection and risk level classification on ToM-DS.
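To give a rough sense of what "adaptively weighing mental-state cues" could look like, the sketch below applies generic scaled dot-product attention over three cue vectors (emotion, intent, action). It is an illustration only and not ToMGuard's actual architecture, which the abstract does not specify; the function names, the toy vectors, and the choice of plain dot-product scoring are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weigh_cues(utterance_vec, cue_vecs):
    """Hypothetical helper: score each mental-state cue against the utterance.

    Returns per-cue attention weights and the weighted sum of the cue vectors.
    This is standard scaled dot-product attention, used here as a stand-in
    for the paper's (unspecified) dynamic ToM attention.
    """
    names = list(cue_vecs)
    scale = math.sqrt(len(utterance_vec))
    scores = [dot(utterance_vec, cue_vecs[n]) / scale for n in names]
    weights = softmax(scores)
    fused = [
        sum(w * cue_vecs[n][i] for w, n in zip(weights, names))
        for i in range(len(utterance_vec))
    ]
    return dict(zip(names, weights)), fused


# Toy 4-dimensional embeddings (made up for illustration, not from ToM-DS).
utterance = [1.0, 0.0, 0.0, 0.0]
cues = {
    "emotion": [1.0, 0.0, 0.0, 0.0],
    "intent": [0.0, 1.0, 0.0, 0.0],
    "action": [0.0, 0.0, 1.0, 0.0],
}
weights, fused = weigh_cues(utterance, cues)
```

In this toy setup the cue most aligned with the utterance vector receives the largest weight, so the fused representation leans toward it; a trained model would learn the scoring rather than use raw dot products.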
Anthology ID:
2026.findings-acl.322
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6457–6478
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.322/
Cite (ACL):
Yuanchen Shi, Longyin Zhang, Guodong Zhou, and Fang Kong. 2026. Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech. In Findings of the Association for Computational Linguistics: ACL 2026, pages 6457–6478, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Danger Depends on the Mind: A Theory-of-Mind Grounded Dataset and Model for Context-Dependent Dangerous Speech (Shi et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.322.pdf
Checklist:
 2026.findings-acl.322.checklist.pdf