Superficial Success vs. Internal Breakdown: An Empirical Study of Generalization in Adaptive Multi-Agent Systems

Namyeong So, Seokgyu Jang, Taeuk Kim


Abstract
Adaptive multi-agent systems (MAS) are increasingly adopted as solutions to complex problems. However, their optimization for narrow task ranges leaves it unclear whether they can function as general-purpose systems. To fill this gap, we conduct an extensive empirical study on adaptive MAS, revealing two key findings: (1) they are prone to topological overfitting, defined as failures in domain transfer; and (2) they exhibit illusory coordination, where surface-level accuracy is high but underlying agent coordination deviates from ideal MAS behavior, raising concerns about their practical effectiveness. These observations highlight the urgent need to prioritize generalization in MAS development and motivate more thorough evaluation beyond correctness of the final answer.
Anthology ID:
2026.findings-acl.753
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
15328–15354
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.753/
Cite (ACL):
Namyeong So, Seokgyu Jang, and Taeuk Kim. 2026. Superficial Success vs. Internal Breakdown: An Empirical Study of Generalization in Adaptive Multi-Agent Systems. In Findings of the Association for Computational Linguistics: ACL 2026, pages 15328–15354, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Superficial Success vs. Internal Breakdown: An Empirical Study of Generalization in Adaptive Multi-Agent Systems (So et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.753.pdf
Checklist:
2026.findings-acl.753.checklist.pdf