STeCa: Step-level Trajectory Calibration for LLM Agent Learning

Hanlin Wang; Jian Wang (王剑); Chak Tou Leong; Wenjie Li

Abstract

Large language model (LLM)-based agents have shown promise in tackling complex tasks by interacting dynamically with the environment. Existing work primarily focuses on behavior cloning from expert demonstrations or preference learning through exploratory trajectory sampling. However, these methods often struggle to address long-horizon tasks, where suboptimal actions accumulate step by step, causing agents to deviate from correct task trajectories. To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents. We propose Step-Level Trajectory Calibration (STeCa), a novel framework for LLM agent learning. Specifically, STeCa identifies suboptimal actions through a step-level reward comparison during exploration. It constructs calibrated trajectories using LLM-driven reflection, enabling agents to learn from improved decision-making processes. Finally, we combine these calibrated trajectories with successful trajectories for reinforced training. Extensive experiments demonstrate that STeCa significantly outperforms existing methods. Further analysis highlights that timely calibration enables agents to complete tasks with greater robustness. Our code and data are available at https://github.com/WangHanLinHenry/STeCa.
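The abstract does not spell out how the step-level reward comparison is carried out. As a rough illustration only, the sketch below assumes that each step of an explored trajectory and of a reference trajectory already has a scalar reward, and flags the first step where the explored reward falls short. The function name, the `margin` parameter, and the reward values are hypothetical and are not taken from the paper or its repository.

```python
from typing import Optional, Sequence


def first_suboptimal_step(
    explored_rewards: Sequence[float],
    reference_rewards: Sequence[float],
    margin: float = 0.0,
) -> Optional[int]:
    """Return the index of the first step whose explored reward is lower than
    the reference reward by more than `margin`, or None if no step qualifies.

    Hypothetical helper: the comparison rule is an assumption, not STeCa's
    actual criterion.
    """
    pairs = zip(explored_rewards, reference_rewards)
    for step, (explored, reference) in enumerate(pairs):
        if reference - explored > margin:
            return step
    return None


# Made-up per-step rewards, for illustration only.
explored = [0.6, 0.7, 0.4, 0.3]
reference = [0.6, 0.7, 0.8, 0.9]
print(first_suboptimal_step(explored, reference))  # prints 2
```

In a pipeline of this kind, the flagged step would mark where a calibrated continuation could be constructed, for example through the LLM-driven reflection the abstract mentions.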

Anthology ID: 2025.findings-acl.604
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11597–11614
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.604/
Cite (ACL): Hanlin Wang, Jian Wang, Chak Tou Leong, and Wenjie Li. 2025. STeCa: Step-level Trajectory Calibration for LLM Agent Learning. In Findings of the Association for Computational Linguistics: ACL 2025, pages 11597–11614, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): STeCa: Step-level Trajectory Calibration for LLM Agent Learning (Wang et al., Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.604.pdf
