Wu Liu
2026
GASim: A Graph-Accelerated Hybrid Framework for Social Simulation
Xuan Zhou | Yanhui Sun | Hantao Yao | Allen He | Yongdong Zhang | Wu Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large-scale social simulators are essential for studying complex social patterns. Prior work explores hybrid methods to scale up simulations, combining large language model (LLM)-based agents with numerical agent-based models (ABMs). However, this incurs high latency due to expensive memory retrieval and sequential ABM execution. To address this challenge, we propose GASim, a graph-accelerated hybrid multi-agent framework for large-scale social simulations. For core agents driven by LLMs, GASim introduces Graph-Optimized Memory (GOM) to replace intensive LLM-based retrieval pipelines with lightweight propagation over a sparse memory graph. For the majority of ordinary agents, GASim employs Graph Message Passing (GMP), substituting sequential ABM execution with parallel updates via fine-grained feature aggregation and a Graph Attention Network. We further introduce Entropy-Driven Grouping (EDG), which coordinates this hybrid partitioning, leveraging information entropy to dynamically identify emergent core agents situated in information-diverse neighborhoods. Extensive experiments show that GASim not only delivers a substantial 9.94× end-to-end speedup over the traditional hybrid framework but also consumes less than 20% of baseline tokens, significantly reducing costs while preserving strong alignment with real-world public opinion trends.
2025
ACEBench: A Comprehensive Evaluation of LLM Tool Usage
Chen Chen | Xinlong Hao | Weiwen Liu | Xu Huang | Xingshan Zeng | Shuai Yu | Dexun Li | Yuefeng Huang | Xiangcheng Liu | Wang Xinzhi | Wu Liu
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) have demonstrated significant potential in decision-making and reasoning, particularly when integrated with various tools to effectively solve complex problems. However, existing benchmarks for evaluating LLMs’ tool usage face several limitations: (1) limited evaluation scenarios, often lacking assessments in real multi-turn dialogue contexts; (2) narrow evaluation dimensions, with insufficient detailed assessments of how LLMs use tools; and (3) reliance on LLMs or real API executions for evaluation, which introduces significant overhead. To address these challenges, we introduce ACEBench, a comprehensive benchmark for assessing tool usage in LLMs. ACEBench categorizes data into three primary types based on evaluation methodology: Normal, Special, and Agent. “Normal” evaluates tool usage in basic scenarios; “Special” evaluates tool usage in situations with ambiguous or incomplete instructions; “Agent” evaluates tool usage through multi-agent interactions to simulate real-world, multi-turn dialogues. We conducted extensive experiments using ACEBench, analyzing various LLMs in depth and providing a more granular examination of error causes across different data types.
2006
France Telecom R&D Beijing Word Segmenter for Sighan Bakeoff 2006
Wu Liu | Heng Li | Yuan Dong | Nan He | Haitao Luo | Haila Wang
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing