Heyang Gao
2026
Learning from Cognition: Enhancing RL Efficiency for LLM Reasoning via Hierarchical Metacognitive Decomposition and Refinement
Zexu Sun | Yongcheng Zeng | Erxue Min | Heyang Gao | Bokai Ji | Dugang Liu | Xing Tang | Xiuqiang He | Xu Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent progress in large language models (LLMs) has shown notable reasoning capabilities arising from reinforcement learning (RL) with verifiable rewards. However, “zero-RL” approaches that rely on fixed prompt templates are sample-inefficient for weak LLMs, because most problems yield invalid outputs under accuracy-driven filtering. To address this, we propose Cog-Rethinker, a novel hierarchical metacognitive RL framework. Cog-Rethinker improves sample utilization in the rollout procedure through a two-stage framework that draws on human cognition. First, it prompts the policy to decompose zero-accuracy problems into subproblems. Second, it prompts the policy to refine answers by referencing previous wrong solutions. Moreover, to enable cold starts and maintain train-test consistency, Cog-Rethinker applies supervised fine-tuning on the correct samples from these stages. Experimental results demonstrate Cog-Rethinker’s superior performance on mathematical reasoning benchmarks and improved sample efficiency, which accelerates convergence compared to baselines.
2025
GenSim: A General Social Simulation Platform with Large Language Model based Agents
Jiakai Tang | Heyang Gao | Xuchen Pan | Lei Wang | Haoran Tan | Dawei Gao | Yushuo Chen | Xu Chen | Yankai Lin | Yaliang Li | Bolin Ding | Jingren Zhou | Jun Wang | Ji-Rong Wen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)
With the rapid advancement of large language models (LLMs), recent years have witnessed many promising studies on leveraging LLM-based agents to simulate human social behavior. While prior work has demonstrated significant potential across various domains, much of it has focused on specific scenarios involving a limited number of agents and has lacked the ability to adapt when errors occur during simulation. To overcome these limitations, we propose a novel LLM-agent-based simulation platform called GenSim, which (1) abstracts a set of general functions to simplify the simulation of customized social scenarios; (2) supports one hundred thousand agents to better simulate large-scale populations in real-world contexts; and (3) incorporates error-correction mechanisms to ensure more reliable and long-term simulations. To evaluate our platform, we assess both the efficiency of large-scale agent simulations and the effectiveness of the error-correction mechanisms. To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform based on LLM agents, promising to further advance the field of social science.