ToM-Synth: Scaling Robust Theory of Mind in LLMs via 6,912 Structured Social Units

Guiyang Hou; Xiang Huang; Shangke Lyu; Yuchuan Wu; Weiyao Luo; Xinyu Mei; Yongliang Shen; Weiming Lu; Yongbin Li

Abstract

Theory of Mind (ToM), the ability to infer others’ mental states from behavior, is pivotal for developing machines with human-level social intelligence. Existing methods for endowing LLMs with ToM fall into two paradigms: training-free methods, and methods that repurpose ToM evaluation benchmarks as training data for RL-based fine-tuning. However, training-free methods do not internalize the augmented ToM ability in the LLM itself. Meanwhile, using evaluation benchmarks as training sources is conceptually problematic and, in practice, results in narrow in-domain overfitting rather than robust ToM. To address the lack of training resources in the ToM community and to equip LLMs with robust ToM, we introduce ToM-Synth, a factorial combinatorial synthesis framework of 6,912 social units. This framework enables the systematic synthesis of ToM data, yielding a training dataset of 27,648 instances, termed ToM-Synth-27K. RL fine-tuning on ToM-Synth-27K yields consistent and significant improvements across models of varying families and scales on ToM, Emotional Intelligence, and Social Commonsense benchmarks. Furthermore, we observe concurrent improvements on IQ-related tasks (math, science, logic), and performance continues to improve as the amount of training data increases.
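As a rough illustration of what a factorial combinatorial design means, the sketch below enumerates every combination of a set of factors. The factor names and level counts are hypothetical placeholders, chosen only so that their product is 6,912; the abstract does not state the actual dimensions. The figure of four instances per unit is simply 27,648 / 6,912, and treating it as uniform across units is an assumption.

```python
from itertools import product

# Hypothetical factors and level counts, chosen only so that their product
# equals 6,912. They are NOT the dimensions used in the paper.
FACTOR_LEVELS = {
    "factor_a": 4,
    "factor_b": 4,
    "factor_c": 4,
    "factor_d": 4,
    "factor_e": 3,
    "factor_f": 3,
    "factor_g": 3,
}

# Implied by 27,648 / 6,912 in the abstract; uniformity is an assumption.
INSTANCES_PER_UNIT = 4


def enumerate_units(factor_levels):
    """Return every combination of factor levels (a full factorial design)."""
    names = list(factor_levels)
    level_ranges = [range(factor_levels[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_ranges)]


units = enumerate_units(FACTOR_LEVELS)
num_units = len(units)
num_instances = num_units * INSTANCES_PER_UNIT
print(num_units, num_instances)
```

Under these placeholder level counts, the enumeration produces one unit per distinct combination, so the unit count is the product of the level counts.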

Anthology ID: 2026.findings-acl.2113
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 42567–42582
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2113/
Cite (ACL): Guiyang Hou, Xiang Huang, Shangke Lyu, Yuchuan Wu, Weiyao Luo, Xinyu Mei, Yongliang Shen, Weiming Lu, and Yongbin Li. 2026. ToM-Synth: Scaling Robust Theory of Mind in LLMs via 6,912 Structured Social Units. In Findings of the Association for Computational Linguistics: ACL 2026, pages 42567–42582, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ToM-Synth: Scaling Robust Theory of Mind in LLMs via 6,912 Structured Social Units (Hou et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.2113.pdf
Checklist: 2026.findings-acl.2113.checklist.pdf
