Zhiheng Zheng


2025

A Parallelized Framework for Simulating Large-Scale LLM Agents with Realistic Environments and Interactions
Jun Zhang | Yuwei Yan | Junbo Yan | Zhiheng Zheng | Jinghua Piao | Depeng Jin | Yong Li
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)

The development of large language models (LLMs) offers a feasible approach to simulating the complex behavioral patterns of individuals, enabling the reconstruction of microscopic, realistic human societal dynamics. However, this approach demands a realistic environment that provides feedback for the evolution of agents, as well as a parallelized framework that supports the massive, uncertain interactions between agents and environments. To address the gaps in existing works, which lack real-world environments and struggle with complex interactions, we design a scalable framework named AgentSociety, which integrates realistic societal environments and parallelized interactions to support simulations of large-scale agents. Experiments demonstrate that the framework can simulate 30,000 agents faster than wall-clock time on 24 NVIDIA A800 GPUs, and that performance scales linearly with LLM computational resources. We also show that integrating realistic environments significantly enhances the authenticity of the agents' behaviors. Through the framework and experimental results, we are confident that deploying large-scale LLM agents to simulate human societies is now feasible. This will help practitioners in fields such as the social and management sciences to obtain new scientific discoveries via language generation technologies, and even to improve planning and decision-making in the real world. The code is available at https://github.com/tsinghua-fib-lab/agentsociety/.
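
The scaling claim rests on issuing agent-environment interactions concurrently, so that throughput is bounded by LLM serving capacity rather than by serial agent loops. Below is a minimal Python sketch of that pattern using asyncio; the Environment class, the llm_decide stub, and the latency figures are hypothetical illustrations, not the actual AgentSociety implementation.

```python
import asyncio
import random

# Hypothetical stand-in for an LLM call; a real framework would dispatch
# requests to a pool of LLM serving instances (e.g., GPUs behind an API).
async def llm_decide(agent_id: int, observation: str) -> str:
    await asyncio.sleep(random.uniform(0.05, 0.2))  # simulated LLM latency
    return f"agent {agent_id} action for: {observation}"

class Environment:
    """Shared environment that provides observations and absorbs actions."""
    def observe(self, agent_id: int) -> str:
        return f"tick observation for agent {agent_id}"

    def apply(self, agent_id: int, action: str) -> None:
        pass  # update societal state (mobility, economy, social ties, ...)

async def agent_step(env: Environment, agent_id: int) -> None:
    obs = env.observe(agent_id)
    action = await llm_decide(agent_id, obs)  # LLM calls overlap in flight
    env.apply(agent_id, action)

async def simulate(num_agents: int, ticks: int) -> None:
    env = Environment()
    for _ in range(ticks):
        # One simulation tick: all agents act concurrently rather than
        # serially, so tick time tracks LLM serving capacity, not agent count.
        await asyncio.gather(*(agent_step(env, i) for i in range(num_agents)))

asyncio.run(simulate(num_agents=100, ticks=3))
```

Under this pattern, adding LLM serving capacity shortens the per-tick latency of the gathered calls, which is consistent with the linear scaling the abstract reports.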

CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space
Yong Zhao | Kai Xu | Zhengqiu Zhu | Yue Hu | Zhiheng Zheng | Yingfeng Chen | Yatai Ji | Chen Gao | Yong Li | Jincai Huang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Embodied Question Answering (EQA) has primarily focused on indoor environments, leaving the complexities of urban settings, spanning environment, action, and perception, largely unexplored. To bridge this gap, we introduce CityEQA, a new task in which an embodied agent answers open-vocabulary questions through active exploration in dynamic city spaces. To support this task, we present CityEQA-EC, the first benchmark dataset featuring 1,412 human-annotated tasks across six categories, grounded in a realistic 3D urban simulator. Moreover, we propose Planner-Manager-Actor (PMA), a novel agent tailored for CityEQA. PMA enables long-horizon planning and hierarchical task execution: the Planner decomposes question answering into sub-tasks, the Manager maintains an object-centric cognitive map for spatial reasoning and process control, and specialized Actors handle the navigation, exploration, and collection sub-tasks. Experiments demonstrate that PMA achieves 60.7% of human-level answering accuracy, significantly outperforming frontier-based baselines. While promising, the gap to human performance highlights the need for enhanced visual reasoning in CityEQA. This work paves the way for future advances in urban spatial intelligence. Dataset and code are available at https://github.com/tsinghua-fib-lab/CityEQA.git.
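
The abstract describes a three-level hierarchy: a Planner that decomposes the question, a Manager that keeps an object-centric cognitive map, and specialized Actors for navigation, exploration, and collection. The Python sketch below shows one plausible shape of that dispatch, assuming those roles map to classes; all names (CognitiveMap, decompose, the hard-coded plan) are hypothetical illustrations, not the released PMA code.

```python
from dataclasses import dataclass, field

@dataclass
class CognitiveMap:
    """Object-centric map the Manager maintains for spatial reasoning."""
    objects: dict = field(default_factory=dict)  # name -> (x, y) landmark

    def update(self, name: str, position: tuple) -> None:
        self.objects[name] = position

class Planner:
    def decompose(self, question: str) -> list:
        # A real Planner would prompt an LLM; here we return a fixed plan
        # of (sub_task, target) pairs purely for illustration.
        return [("navigation", "street corner"),
                ("exploration", "red building"),
                ("collection", "answer evidence")]

class Manager:
    def __init__(self):
        self.cmap = CognitiveMap()
        self.actors = {"navigation": self.navigate,
                       "exploration": self.explore,
                       "collection": self.collect}

    def run(self, plan: list) -> None:
        for sub_task, target in plan:
            self.actors[sub_task](target)  # dispatch to the specialized Actor

    def navigate(self, target: str) -> None:
        print(f"[Actor:navigation] moving toward {target}")

    def explore(self, target: str) -> None:
        print(f"[Actor:exploration] scanning for {target}")
        self.cmap.update(target, (12.0, 34.0))  # record a hypothetical sighting

    def collect(self, target: str) -> None:
        print(f"[Actor:collection] gathering {target}")

question = "What color is the building next to the street corner?"
Manager().run(Planner().decompose(question))
```

The point of the structure is separation of concerns: long-horizon reasoning lives in the Planner, persistent spatial state in the Manager's map, and short-horizon control in the Actors, which matches the long-horizon planning and hierarchical execution the abstract attributes to PMA.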