When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks

Jiahao Zhang, Baoshuo Kan, Tao Gong, Fu Lee Wang, Tianyong Hao


Abstract
This paper investigates the group characteristics in multi-agent collaborative systems under adversarial attacks. Adversarial agents are tasked with generating counterfactual answers to a given collaborative problem, while collaborative agents normally interact with other agents to solve the given problem. To simulate real-world collaboration scenarios as closely as possible, we evaluate the collaborative system in three different collaboration scenarios and design three different communication strategies and different group structures. Furthermore, we explored several methods to mitigate adversarial attacks, all of which have been proven effective through our experiments. To quantify the robustness of collaborative systems against such attacks, a novel metric, System Defense Index (SDI), is introduced. Finally, we conducted an in-depth analysis from the perspective of group dynamics on how adversarial agents affect multi-agent collaborative systems, which reveals similarities between the agent collaboration process and human collaboration process. The code will be made available after publication.
Anthology ID:
2025.findings-emnlp.333
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
6275–6300
Language:
URL:
https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.333/
DOI:
10.18653/v1/2025.findings-emnlp.333
Bibkey:
Cite (ACL):
Jiahao Zhang, Baoshuo Kan, Tao Gong, Fu Lee Wang, and Tianyong Hao. 2025. When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 6275–6300, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
When Allies Turn Foes: Exploring Group Characteristics of LLM-Based Multi-Agent Collaborative Systems Under Adversarial Attacks (Zhang et al., Findings 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-luhme/2025.findings-emnlp.333.pdf
Checklist:
 2025.findings-emnlp.333.checklist.pdf