Laura Yao
2026
Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction
Xu Shen | Qi Zhang | Song Wang | Zhen Tan | Xinyu Zhao | Laura Yao | Vaishnav Tadiparthi | Hossein Nourkhiz Mahjoub | Ehsan Moradi Pari | Kwonjoon Lee | Tianlong Chen
Findings of the Association for Computational Linguistics: ACL 2026
Multi-agent systems (MAS) built on large language models excel at collaborative problem solving but remain brittle to cascading errors: a single faulty step can propagate across agents and derail the trajectory. In this paper, we present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction. MASC recasts detection as history-conditioned anomaly scoring through two complementary designs: (1) Next-Execution Reconstruction, which predicts the embedding of the next step from the query and interaction history to capture causal consistency, and (2) Prototype-Guided Enhancement, which learns a prototype prior over normal-step embeddings and uses it to stabilize reconstruction and anomaly scoring under sparse context (e.g., early steps). When a step is flagged as anomalous, MASC triggers a correction agent to revise the acting agent’s output before the information flows downstream. On the Who&When benchmark, MASC consistently outperforms all baselines, achieving up to 7.8% AUC-ROC improvement in the challenging w/o GT setting, and delivers further gains on AgentErrorBench. When plugged into diverse MAS frameworks, it yields end-to-end gains across architectures, indicating that metacognitive monitoring and targeted correction can mitigate error propagation with minimal overhead.
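The detection loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the predictor, the scoring function, or how the prototype is combined with the reconstruction. Everything below is an assumption made for illustration: the mean-pooling stand-in `predict_next`, cosine distance as the anomaly measure, the blending weight `k / (k + len(history))`, the threshold, and all function names.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 1.0
    return 1.0 - float(np.dot(a, b) / denom)


def predict_next(query_emb, history):
    """Hypothetical stand-in for a learned next-step predictor.

    Assumption: the mean of the query and history embeddings. The paper
    presumably trains a model here; the abstract does not describe it.
    """
    return np.mean(np.vstack([query_emb, *history]), axis=0)


def anomaly_score(query_emb, history, step_emb, prototype, k=2.0):
    """Blend reconstruction error with distance to a normal-step prototype.

    Assumption: the prototype weight w = k / (k + len(history)) is large
    when context is sparse (early steps) and shrinks as history grows.
    """
    w = k / (k + len(history))
    recon_err = cosine_distance(predict_next(query_emb, history), step_emb)
    proto_err = cosine_distance(prototype, step_emb)
    return (1.0 - w) * recon_err + w * proto_err


def monitor(query_emb, step_embs, prototype, threshold=0.5):
    """Return the indices of steps whose anomaly score exceeds the threshold.

    A real system would hand a flagged step to a correction agent and
    continue with the revised output; this sketch simply keeps flagged
    steps out of the history so they cannot influence later predictions.
    """
    history, flagged = [], []
    for i, emb in enumerate(step_embs):
        score = anomaly_score(query_emb, history, emb, prototype)
        if score > threshold:
            flagged.append(i)
        else:
            history.append(emb)
    return flagged


# Toy 2-D embeddings: the third step points away from everything before it.
query = np.array([1.0, 0.0])
prototype = np.array([1.0, 0.0])
steps = [
    np.array([1.0, 0.1]),
    np.array([0.9, 0.0]),
    np.array([-1.0, 0.0]),
    np.array([1.0, 0.0]),
]
flagged_steps = monitor(query, steps, prototype)
```

With an empty history the score reduces entirely to the prototype term, which mirrors the abstract's point that the prototype prior stabilizes scoring for early steps; how MASC actually weights the two signals is not stated in the source.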