@inproceedings{yao-yadav-2025-diverse,
title = "Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math Reasoning",
author = "Yao, Bohan and
Yadav, Vikas",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1377/",
doi = "10.18653/v1/2025.findings-emnlp.1377",
pages = "25264--25282",
ISBN = "979-8-89176-335-7",
abstract = "Tool usage is a proven technique for developing high-performance reasoning in large language models (LLMs). Our work is focused on emphasizing the utility of leveraging multiple diverse tools for complex reasoning tasks. We present $\textbf{Multi-TAG}$, a $\textbf{Multi}$-$\textbf{T}$ool $\textbf{AG}$gregation-based LLM framework that utilizes multiple diverse tools to solve complex math problems over multiple reasoning steps. At each reasoning step, $\textbf{Multi-TAG}$ invokes multiple tools and accepts the solution of the respective step by tools that have majority agreement on the final answer estimate. $\textbf{Multi-TAG}$ strongly outperforms several standard baselines that use individual tools with the same number of runs, highlighting the importance of multi-tool invocation for solving complex reasoning tasks. We also show that naive aggregation of multiple tools at each reasoning step also leads to substantial improvements of up to 35{\%} accuracy. $\textbf{Multi-TAG}$ then further improves these gains by 7.4{\%} on average on MATH500, AIME, AMC, and OlympiadBench."
}

Markdown (Informal)
[Diverse Multi-tool Aggregation with Large Language Models for Enhanced Math Reasoning](https://aclanthology.org/2025.findings-emnlp.1377/) (Yao & Yadav, Findings 2025)
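For intuition, here is a minimal Python sketch of the per-step majority aggregation the abstract describes: run several tools at each reasoning step, vote on their final-answer estimates, and keep a step proposed by a tool in the majority bucket. Everything below is an illustrative assumption, not the paper's implementation; the `Tool` signature and the `multi_tool_step` and `solve` names are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical tool interface: given the problem and the accepted steps so far,
# a tool proposes the next step together with its current final-answer estimate.
Tool = Callable[[str, List[str]], Tuple[str, str]]

def multi_tool_step(problem: str, steps: List[str], tools: List[Tool]) -> str:
    """One aggregation step in the spirit of Multi-TAG: invoke every tool,
    vote on the final-answer estimates, and accept a step from a tool
    whose estimate agrees with the majority."""
    proposals = [tool(problem, steps) for tool in tools]
    votes = Counter(answer for _, answer in proposals)
    majority_answer, _ = votes.most_common(1)[0]
    # Return the first proposed step whose estimate matches the majority answer.
    for step, answer in proposals:
        if answer == majority_answer:
            return step
    raise RuntimeError("unreachable: the majority answer always has a proposer")

def solve(problem: str, tools: List[Tool], max_steps: int = 10) -> List[str]:
    """Build a multi-step solution by applying the aggregation step repeatedly."""
    steps: List[str] = []
    for _ in range(max_steps):
        steps.append(multi_tool_step(problem, steps, tools))
    return steps
```

In this reading, the vote is over final-answer estimates rather than over the step text itself, which is what lets heterogeneous tools (e.g., code execution vs. symbolic solvers) agree even when their intermediate steps differ in form.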