Qing Guo
2026
From Language to Driving: A Dual-Loop SLM-Enhanced Framework for Multi-Planner Scheduling via a Domain-Specific Language
Jiawei Liu | Xun Gong | Muli Yang | Xingrui Yu | Fen Fang | Xulei Yang | Ivor Tsang | Yunfeng Hu | Hong Chen | Qing Guo
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Advancing from usable to collaborative autonomy requires driving systems to execute passenger instructions safely and reliably. This work formulates instruction realization as scheduling across multiple motion planners and presents a dual-loop framework that provides a transparent decision chain from natural language to vehicle control. The outer loop uses a small language model (SLM) for high-level, low-frequency semantic reasoning and schedule generation, while the inner loop performs low-level, high-frequency schedule execution and vehicle control. To compensate for the SLM’s limited capacity, the framework integrates receding-horizon scheduling to segment long-horizon instruction tasks, a domain-specific language (DSL) that restricts SLM outputs to a scheduling-oriented subspace, and reinforcement learning in high-fidelity urban traffic to refine the SLM’s DSL proficiency and scheduling performance. Experiments show that the framework improves instruction-completion rates while maintaining high safety and compliance relative to multiple baselines.
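The dual-loop idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the DSL grammar (`<planner> <steps>` per line), the planner names, the fallback rule, and the `stub_slm` function standing in for the small language model are all assumptions made for illustration. The sketch shows a low-frequency outer loop that regenerates a schedule every `outer_period` ticks (receding horizon), a validator that rejects output outside the DSL, and a high-frequency inner loop that selects the active planner on every tick.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical DSL: one "<planner> <steps>" command per line.
ALLOWED_PLANNERS = {"lane_keep", "lane_change_left", "stop"}
FALLBACK = "lane_keep"  # assumed safe default when the schedule is unusable


@dataclass(frozen=True)
class Segment:
    planner: str
    steps: int


def parse_schedule(text: str) -> List[Segment]:
    """Parse DSL text, rejecting anything outside the allowed subspace."""
    segments = []
    for line in text.strip().splitlines():
        parts = line.split()
        valid = (
            len(parts) == 2
            and parts[0] in ALLOWED_PLANNERS
            and parts[1].isdigit()
            and int(parts[1]) > 0
        )
        if not valid:
            raise ValueError(f"invalid DSL line: {line!r}")
        segments.append(Segment(parts[0], int(parts[1])))
    if not segments:
        raise ValueError("empty schedule")
    return segments


def stub_slm(instruction: str, tick: int) -> str:
    """Stand-in for the SLM (assumption): returns DSL text for the next horizon."""
    if "left" in instruction and tick == 0:
        return "lane_keep 2\nlane_change_left 2"
    return "lane_keep 4"


def run(
    instruction: str,
    total_steps: int,
    outer_period: int,
    slm: Callable[[str, int], str] = stub_slm,
) -> List[str]:
    """Return the planner chosen by the inner loop at every tick."""
    log: List[str] = []
    plan: List[str] = []
    for tick in range(total_steps):
        if tick % outer_period == 0:
            # Outer loop: low-frequency re-scheduling over a receding horizon.
            try:
                segments = parse_schedule(slm(instruction, tick))
            except ValueError:
                segments = [Segment(FALLBACK, outer_period)]
            # Expand segments into one planner name per tick.
            plan = [s.planner for s in segments for _ in range(s.steps)]
        # Inner loop: high-frequency execution of the current schedule.
        offset = tick % outer_period
        log.append(plan[offset] if offset < len(plan) else FALLBACK)
    return log
```

Calling `run("change to the left lane", total_steps=8, outer_period=4)` yields two ticks of `lane_keep`, two of `lane_change_left`, then four of `lane_keep` after the second outer-loop update. Constraining the model to a small grammar makes invalid output easy to detect and replace with a default, which is one plausible reading of why a DSL helps a limited-capacity model.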
Learn Like Humans: Use Meta-cognitive Reflection for Efficient Self-Improvement
Xinmeng Hou | Bohao Qu | Wuqi Wang | Peiliang Gong | Qing Guo | Yang Liu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While Large Language Models (LLMs) enable complex autonomous behavior, current agents remain constrained by static, human-designed prompts that limit adaptability. Existing self-improving frameworks attempt to bridge this gap but typically rely on inefficient, multi-turn recursive loops that incur high computational costs. To address this, we propose Metacognitive Agent with Reflective Self-improvement (MARS), a framework that achieves efficient self-evolution within a single recurrence cycle. Inspired by educational psychology, MARS mimics human learning by integrating principle-based reflection (abstracting normative rules to avoid errors) and procedural reflection (deriving step-by-step strategies for success). By synthesizing these insights into optimized instructions, MARS allows agents to systematically refine their reasoning logic without continuous online feedback. Extensive experiments on six benchmarks demonstrate that MARS outperforms state-of-the-art self-evolving systems while significantly reducing computational overhead. Code is available at https://github.com/Paparare/MARS/tree/main
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models
Zhenhong Zhou | Zherui Li | Jie Zhang | Yuanhe Zhang | Kun Wang | Yang Liu | Qing Guo
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Model-based Multi-Agent Systems represent a promising paradigm for tackling complex problems through agent collaboration. However, the reliance on open-ended communication exposes a fundamental vulnerability: the collaborative process itself can be exploited and disrupted. In this work, we formalize this threat class as Denial-of-Collaboration (DoC). Unlike DoS, which targets individual nodes or services, DoC attacks corrupt the collaborative structure of the system, transforming its communication topology into self-sabotage. The result is excessive resource consumption and eventual system paralysis. We introduce **CO**ntagious **R**ecursive **B**locking **A**ttacks (CORBA) as a concrete example of DoC, which employs benign yet recursively contagious instructions, forcing LLM-MASs into cycles of meaningless message passing. Critically, since our attacks are semantically benign, they easily bypass conventional safety alignments that are not designed to detect behavioral or systemic attacks. Through extensive experiments across diverse topologies and models, we demonstrate that CORBA achieves system paralysis where the baseline attacks fail. Our work reveals emerging DoC threats in current LLM-MAS security and establishes a crucial baseline for developing robust, collaboration-aware defense mechanisms.
2025
Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization
Yihao Huang | Chong Wang | Xiaojun Jia | Qing Guo | Felix Juefei-Xu | Jian Zhang | Yang Liu | Geguang Pu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Universal goal hijacking is a kind of prompt injection attack that forces LLMs to return a target malicious response for arbitrary normal user prompts. Previous methods achieve high attack performance but are cumbersome and time-consuming. They have also concentrated solely on optimization algorithms, overlooking the crucial role of the prompt. To this end, we propose a method called POUGH that incorporates an efficient optimization algorithm and two semantics-guided prompt organization strategies. Specifically, our method starts with a sampling strategy to select representative prompts from a candidate pool, followed by a ranking strategy that prioritizes them. Given the sequentially ranked prompts, our method employs an iterative optimization algorithm to generate a fixed suffix that can be concatenated to arbitrary user prompts for universal goal hijacking. Experiments conducted on four popular LLMs and ten types of target responses verify the method's effectiveness.