Scaling LLMs’ Social Reasoning: Sprinkle Cognitive “Aha Moment” into Fundamental Long-thought Logical Capabilities

Guiyang Hou, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Weiming Lu


Abstract
Humans continually engage in reasoning about others’ mental states, a capability known as Theory of Mind (ToM) that is essential for social interactions. While this social reasoning capability emerges naturally in human cognitive development, how has the social reasoning capability of Large Language Models (LLMs) evolved during their development process? Various datasets have been proposed to assess LLMs’ social reasoning capabilities, but each is designed with a distinct focus, and none has explored how models’ social reasoning capabilities evolve as model size or the number of reasoning tokens scales. In light of this, we optimize the evaluation of LLMs’ social reasoning from both data and model perspectives, constructing social reasoning data at progressively increasing difficulty levels and systematically exploring how LLMs’ social reasoning capabilities evolve. Furthermore, through an in-depth analysis of DeepSeek-R1’s reasoning trajectories, we identify a notable cognitive “Aha Moment” and the causes of its reasoning errors. Experiments reveal that long-thought logical capabilities and cognitive thinking are key to scaling LLMs’ social reasoning capabilities. By equipping the Qwen2.5-32B-Instruct model with long-thought logical capabilities and cognitive thinking, we achieve an improvement of 19.0 points, attaining social reasoning performance comparable to the o1-preview model.
Anthology ID:
2025.findings-acl.162
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
3126–3138
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.162/
Cite (ACL):
Guiyang Hou, Wenqi Zhang, Zhe Zheng, Yongliang Shen, and Weiming Lu. 2025. Scaling LLMs’ Social Reasoning: Sprinkle Cognitive “Aha Moment” into Fundamental Long-thought Logical Capabilities. In Findings of the Association for Computational Linguistics: ACL 2025, pages 3126–3138, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Scaling LLMs’ Social Reasoning: Sprinkle Cognitive “Aha Moment” into Fundamental Long-thought Logical Capabilities (Hou et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.162.pdf