Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search

Wentao Shi, Zichun Yu, Fuli Feng, Xiangnan He, Chenyan Xiong


Abstract
Large Language Model (LLM) based multi-agent systems (MAS) show strong potential for tackling complex tasks through collaborative intelligence. Monte Carlo Tree Search (MCTS) based methods provide promising approaches for enhancing MAS self-training by generating synthetic data, using Q-values to estimate agent contributions. However, relying solely on Q-values may misalign with the goal of selecting data most beneficial for MAS improvement. To address this discrepancy, we propose **D**ata **I**nfluence-oriented **T**ree **S**earch (**DITS**), a novel framework that incorporates influence scores to guide both tree search and data selection in data synthesis. By leveraging influence scores, we effectively identify the most impactful data for MAS improvement, thereby enhancing model performance. Furthermore, we derive a novel influence score estimation method tailored for non-differentiable metrics, significantly reducing computational overhead by calculating performance changes on the validation set. Extensive experiments on three different multi-agent tasks demonstrate the robustness and effectiveness of the proposed methods. Notably, our findings reveal that allocating more resources to estimate influence scores, rather than Q-values, during data synthesis can more effectively and efficiently enhance model training. The code is available at https://anonymous.4open.science/r/DITS-F1C4/.
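The abstract describes estimating influence for non-differentiable metrics from performance changes on a validation set, but it does not give the estimator itself. The following is only a minimal sketch of that general idea, not the paper's implementation: it scores each candidate by the change in a validation metric after a trial update and keeps the top-scoring candidates. The names `influence_scores`, `select_top_k`, `update`, and `evaluate`, and the scalar toy "model", are all hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Model = TypeVar("Model")
Sample = TypeVar("Sample")


def influence_scores(
    candidates: Sequence[Sample],
    base_model: Model,
    update: Callable[[Model, Sample], Model],
    evaluate: Callable[[Model], float],
) -> List[float]:
    """Score each candidate by the validation-metric change it induces.

    Assumption: `update` returns a new model after a trial update on one
    candidate, and `evaluate` returns a validation metric (higher is better).
    """
    base_metric = evaluate(base_model)
    return [evaluate(update(base_model, c)) - base_metric for c in candidates]


def select_top_k(
    candidates: Sequence[Sample], scores: Sequence[float], k: int
) -> List[Sample]:
    """Return the k candidates with the largest scores (ties keep input order)."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:k]]


# Toy stand-ins (hypothetical): the "model" is a single float, an update nudges
# it by 0.1 * sample, and the validation metric rewards being close to 1.0.
def toy_update(model: float, sample: float) -> float:
    return model + 0.1 * sample


def toy_evaluate(model: float) -> float:
    return -abs(model - 1.0)


toy_candidates = [0.5, -0.5, 2.0]
toy_scores = influence_scores(toy_candidates, 0.0, toy_update, toy_evaluate)
toy_selected = select_top_k(toy_candidates, toy_scores, k=2)
```

In this toy setup the candidate that moves the model furthest toward the target gets the largest score, and the candidate that moves it away gets a negative score and is dropped. A real system would replace the toy callables with an actual training step and task metric.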
Anthology ID:
2026.acl-long.296
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
6540–6558
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.296/
Cite (ACL):
Wentao Shi, Zichun Yu, Fuli Feng, Xiangnan He, and Chenyan Xiong. 2026. Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6540–6558, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search (Shi et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.296.pdf
Checklist:
 2026.acl-long.296.checklist.pdf