Xinkui Zhao


2025

AgentPro: Enhancing LLM Agents with Automated Process Supervision
Yuchen Deng | Shichen Fan | Naibo Wang | Xinkui Zhao | See-Kiong Ng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language model (LLM) agents have demonstrated significant potential for addressing complex tasks through mechanisms such as chain-of-thought reasoning and tool invocation. However, current frameworks lack explicit supervision during the reasoning process, which may lead to error propagation across reasoning chains and hinder the optimization of intermediate decision-making stages. This paper introduces a novel framework, AgentPro, which enhances LLM agent performance through automated process supervision. AgentPro employs Monte Carlo Tree Search to automatically generate step-level annotations and builds a process reward model on these annotations to enable fine-grained quality assessment of reasoning. Using a rejection sampling strategy, the LLM agent dynamically adjusts its generation probability distribution to avoid continuing along erroneous paths, thereby improving its reasoning capabilities. Extensive experiments on four datasets show that our method significantly outperforms existing agent-based LLM methods (e.g., achieving a 6.32% increase in accuracy on the HotpotQA dataset), underscoring its proficiency in managing intricate reasoning chains.
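As a rough illustration of the rejection-sampling idea described in the abstract, the sketch below pairs a placeholder step generator with a placeholder process reward model and resamples any reasoning step whose score falls below a threshold. The functions `generate_step` and `prm_score`, the threshold, and the retry budget are assumptions for illustration only, not AgentPro's actual implementation.

```python
# Hypothetical sketch of PRM-guided rejection sampling over reasoning steps.
# `generate_step` and `prm_score` stand in for an LLM sampler and a trained
# process reward model; neither is taken from the AgentPro paper itself.
import random

def generate_step(history: list[str]) -> str:
    # Placeholder: in practice this would sample the next reasoning step
    # from the LLM conditioned on the trajectory so far.
    return f"step-{len(history)}-candidate-{random.randint(0, 9)}"

def prm_score(history: list[str], step: str) -> float:
    # Placeholder: a real process reward model would score the step's
    # quality given the partial reasoning chain.
    return random.random()

def sample_trajectory(max_steps: int = 5, threshold: float = 0.5,
                      max_retries: int = 8) -> list[str]:
    """Build a reasoning chain, rejecting low-reward steps and resampling."""
    history: list[str] = []
    for _ in range(max_steps):
        best_step, best_score = None, -1.0
        for _ in range(max_retries):
            step = generate_step(history)
            score = prm_score(history, step)
            if score >= threshold:      # accept the first step above threshold
                best_step, best_score = step, score
                break
            if score > best_score:      # otherwise keep the best fallback seen
                best_step, best_score = step, score
        history.append(best_step)
    return history

if __name__ == "__main__":
    print(sample_trajectory())
```

In this toy setting, rejecting low-scoring candidates before they are appended is what prevents an erroneous partial chain from being extended, which is the role the process reward model plays in the framework described above.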

Probing the Geometry of Truth: Consistency and Generalization of Truth Directions in LLMs Across Logical Transformations and Question Answering Tasks
Yuntai Bao | Xuhong Zhang | Tianyu Du | Xinkui Zhao | Zhengwen Feng | Hao Peng | Jianwei Yin
Findings of the Association for Computational Linguistics: ACL 2025

Large language models (LLMs) are trained on extensive datasets that encapsulate substantial world knowledge. However, their outputs often include confidently stated inaccuracies. Earlier works suggest that LLMs encode truthfulness as a distinct linear feature, termed the “truth direction”, which can classify truthfulness reliably. We address several open questions about the truth direction: (i) whether LLMs universally exhibit consistent truth directions; (ii) whether sophisticated probing techniques are necessary to identify truth directions; and (iii) how the truth direction generalizes across diverse contexts. Our findings reveal that not all LLMs exhibit consistent truth directions, with stronger representations observed in more capable models, particularly in the context of logical negation. Additionally, we demonstrate that truthfulness probes trained on declarative atomic statements can generalize effectively to logical transformations, question-answering tasks, in-context learning, and external knowledge sources. Finally, we explore the practical application of truthfulness probes in selective question-answering, illustrating their potential to improve user trust in LLM outputs. These results advance our understanding of truth directions and provide new insights into the internal representations of LLM beliefs.
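For readers unfamiliar with truth-direction probing, the following minimal sketch fits a linear (logistic-regression) probe on synthetic stand-ins for hidden-state activations and reads off its weight vector as an estimated truth direction. The hidden size, the planted separating direction, and the synthetic data are assumptions for illustration; the paper's own probing setup, models, and datasets may differ.

```python
# Minimal sketch of fitting a linear "truth direction" probe.
# The activations here are synthetic placeholders; in practice they would be
# LLM hidden states taken at the final token of each declarative statement.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_model = 512                      # assumed hidden size of the model
n_true, n_false = 200, 200

# Placeholder activations: true and false statements separated along a
# random direction, mimicking a linearly encoded truthfulness feature.
direction = rng.normal(size=d_model)
direction /= np.linalg.norm(direction)
X_true = rng.normal(size=(n_true, d_model)) + 2.0 * direction
X_false = rng.normal(size=(n_false, d_model)) - 2.0 * direction
X = np.vstack([X_true, X_false])
y = np.concatenate([np.ones(n_true), np.zeros(n_false)])

# A simple logistic-regression probe; its weight vector is the estimated
# truth direction, which could then be evaluated on transformed statements
# (negations, QA contexts, etc.) to test generalization.
probe = LogisticRegression(max_iter=1000).fit(X, y)
truth_direction = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
print("train accuracy:", probe.score(X, y))
print("cosine with planted direction:", float(truth_direction @ direction))
```

A probe this simple is deliberate: the abstract's question of whether sophisticated probing techniques are necessary is precisely about whether a plain linear classifier like this already recovers a reliable truth direction.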