DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning

Xiaochuan Liu, Yuanfeng Song, Xiaoming Yin, Xing Chen


Abstract
In today’s data-driven era, fully automated end-to-end data analytics, particularly insight discovery, is critical for surfacing actionable insights that help organizations make effective decisions. With the rapid advancement of large language models (LLMs), LLM-driven agents have emerged as a promising paradigm for automating insight discovery. However, existing data insight agents often fail to deliver satisfactory results because of three limitations: (1) insufficient utilization of domain knowledge, (2) shallow analytical depth, and (3) error-prone code generation. To address these issues, we propose DataSage, a novel multi-agent framework with three key features: external knowledge retrieval to enrich the analytical context, a multi-role debating mechanism that simulates diverse analytical perspectives to deepen the analysis, and multi-path reasoning to improve the accuracy of the generated code and insights. Extensive experiments on InsightBench demonstrate that DataSage consistently outperforms existing data insight agents across all difficulty levels, improving the insight-level and summary-level metrics by 7.5% and 13.9%, respectively. DataSage thus offers an effective solution for automated data insight discovery.
Anthology ID:
2026.findings-acl.309
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6216–6250
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.309/
Cite (ACL):
Xiaochuan Liu, Yuanfeng Song, Xiaoming Yin, and Xing Chen. 2026. DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 6216–6250, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning (Liu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.309.pdf
Checklist:
2026.findings-acl.309.checklist.pdf