Yuxuan Peng
2026
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
Jianghao Lin | Yuanyuan Shi | Xin Peng | Renjie Ding | Hairui Wang | Yuxuan Peng | Bizhe Bai | Weixi Song | Fengshuo Bai | Huacan Chai | Weinan Zhang | Fei Huang | Ying Wen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) excel at function calling, but inference scaling has been explored mainly for unstructured generation. We propose an inference-scaling framework for structured outputs that combines fine-grained beam search with ToolPRM, a process reward model that scores each intra-call decision (function name selection and argument filling). We build the first fine-grained intra-call supervision dataset via function masking, rollout collection, and step-level annotation. ToolPRM outperforms outcome and coarse-grained reward models in predictive accuracy and yields consistent test-time gains on multiple function-calling benchmarks. We further show that structured generation follows an “explore more but retain less” principle, since early JSON errors are unrecoverable.
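
To make the idea of step-level scoring inside a single function call concrete, the following is a minimal Python sketch of a beam search that expands and prunes at each intra-call decision. It is an illustration under stated assumptions, not the authors' implementation: the names `fine_grained_beam_search`, `toy_propose`, `toy_score`, `expand_width`, and `beam_width` are all hypothetical, the candidate steps are hard-coded, and `toy_score` is a hand-written stand-in for a learned process reward model such as ToolPRM.

```python
from typing import Callable, List, Tuple

# A partial function call is a tuple of decisions made so far,
# e.g. ("get_weather", 'city="Paris"'). This representation is an assumption.
Partial = Tuple[str, ...]


def fine_grained_beam_search(
    propose: Callable[[Partial], List[str]],
    score: Callable[[Partial], float],
    num_steps: int,
    expand_width: int,
    beam_width: int,
) -> List[Partial]:
    """Expand each beam by up to `expand_width` next decisions, score every
    extended partial call, and keep only the top `beam_width` of them."""
    beams: List[Partial] = [()]
    for _ in range(num_steps):
        candidates: List[Partial] = []
        for beam in beams:
            for step in propose(beam)[:expand_width]:
                candidates.append(beam + (step,))
        if not candidates:
            break
        # Step-level pruning: rank partial calls by the reward model's score.
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
    return beams


# Hypothetical toy setup: step 0 picks a function name, step 1 fills an argument.
# The correct option is deliberately not listed first.
def toy_propose(partial: Partial) -> List[str]:
    if len(partial) == 0:
        return ["get_time", "get_weather", "get_wether"]
    return ["city=Paris", 'city="Paris"', 'town="Paris"']


VALID_STEPS = {"get_weather", 'city="Paris"'}


def toy_score(partial: Partial) -> float:
    # Stand-in for a process reward model: fraction of decisions that are valid.
    if not partial:
        return 0.0
    return sum(step in VALID_STEPS for step in partial) / len(partial)


# Wide exploration, narrow retention.
wide = fine_grained_beam_search(toy_propose, toy_score, 2, 3, 1)
# Narrow exploration: only the first proposal is ever considered.
narrow = fine_grained_beam_search(toy_propose, toy_score, 2, 1, 1)
print(wide)
print(narrow)
```

In this toy setting, expanding three candidates per step while retaining a single beam recovers the valid call, whereas expanding only one candidate commits to a wrong function name at the first step and cannot repair it afterwards. This is only meant to mirror the "explore more but retain less" observation qualitatively; the widths used here carry no empirical meaning.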