Fang Tu
2026
JTPRO: A Joint Tool–Prompt Reflective Optimization Framework for Language Agents
Sandip Ghoshal | Anshul Mittal | Jyotika Singh | Miguel Ballesteros | Weiyi Sun | Fang Tu | Shailender Singh | Yassine Benajiba | Fahad Shah | Sujeeth Bharadwaj | Sujith Ravi | Dan Roth
Findings of the Association for Computational Linguistics: ACL 2026
Large language model (LLM) agents augmented with external tools often struggle as the number of tools grows large and the tools become domain-specific. In such settings, ambiguous tool descriptions and underspecified agent instructions frequently lead to tool mis-selection and incorrect slot/value instantiation. We hypothesize that this is due to two root causes: generic, one-size-fits-all prompts that ignore tool-specific nuances, and underspecified tool schemas that lack clear guidance on when and how to use each tool and how to format its parameters. We introduce Joint Tool-Prompt Reflective Optimization (JTPRO), a framework for improving tool-calling reliability in trace-supervised settings by iteratively using rollout-driven reflection to co-optimize global instructions and per-tool schema/argument descriptions for accurate tool selection and argument instantiation in large tool inventories. JTPRO is designed to preserve only the tool-local cues needed for correct disambiguation and slot filling. We evaluate JTPRO across multi-tool benchmarks that cover different numbers of tools, using three metrics: Tool Selection Accuracy (TSA), Slot Filling Accuracy (SFA), and Overall Success Rate (OSR) (correct tool + correct slots + correct values). JTPRO consistently outperforms strong baselines, including CoT-style agents and reflective prompt optimizers such as GEPA, by 5%–20% (relative) on OSR. Ablations show that joint optimization of instructions and tool schemas is more effective and robust than optimizing either component in isolation.
MT-OSC: Path for LLMs that Get Lost in Multi-Turn Conversation
Jyotika Singh | Fang Tu | Miguel Ballesteros | Weiyi Sun | Sandip Ghoshal | Michelle Yuan | Yassine Benajiba | Sujith Ravi | Dan Roth
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) suffer significant performance degradation when user instructions and context are distributed over multiple conversational turns, yet multi-turn (MT) interactions dominate chat interfaces. The routine approach of appending full chat history to prompts rapidly exhausts context windows, leading to increased latency, higher computational costs, and diminishing returns as conversations extend. We introduce **MT-OSC**, a **O**ne-off **S**equential **C**ondensation framework that efficiently and automatically condenses chat history in the background without disrupting the user experience. MT-OSC employs a Condenser Agent that uses a few-shot inference-based Condenser and a lightweight Decider to selectively retain essential information, reducing token counts by up to 72% in 10-turn dialogues. Evaluated across 13 state-of-the-art LLMs and diverse multi-turn benchmarks, MT-OSC consistently narrows the multi-turn performance gap—yielding improved or preserved accuracy across datasets while remaining robust to distractors and irrelevant turns. Our results establish MT-OSC as a scalable solution for multi-turn chats, enabling richer context within constrained input spaces, reducing latency and operational cost, while balancing performance.