Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky

Ashutosh Hathidara, Julien Yu, Sebastian Schreiber


Abstract
Large language models (LLMs) are increasingly tasked with invoking enterprise APIs, yet they routinely falter when near-duplicate tools vie for the same user intent or when required arguments are left underspecified. We introduce **DiaFORGE** (**Dia**logue **F**ramework for **O**rganic **R**esponse **G**eneration **E**valuation), a disambiguation-centric, three-stage pipeline that (i) synthesizes persona-driven, multi-turn dialogues in which the assistant must distinguish among highly similar tools, (ii) performs supervised fine-tuning with reasoning traces on open-source models ranging from 3B to 70B parameters, and (iii) evaluates real-world readiness via a dynamic suite that redeploys each model in a live agentic loop and reports end-to-end goal completion alongside conventional static metrics. On our dynamic benchmark DiaBENCH, models trained with DiaFORGE raise tool-invocation success by **27 pp over GPT-4o** and by **49 pp over Claude-3.5-Sonnet**, both under optimized prompting. To spur further research, we release an open corpus of **5000 production-grade enterprise API** specifications paired with rigorously validated, disambiguation-focused dialogues, offering a practical blueprint for building reliable, enterprise-ready tool-calling agents.
Anthology ID:
2026.findings-acl.469
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9624–9652
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.469/
Cite (ACL):
Ashutosh Hathidara, Julien Yu, and Sebastian Schreiber. 2026. Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9624–9652, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky (Hathidara et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.469.pdf
Checklist:
 2026.findings-acl.469.checklist.pdf