Xianshu Peng


2025

GRAT: Guiding Retrieval-Augmented Reasoning through Process Rewards Tree Search
Xianshu Peng | Wei Wei
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Enhancing large models for complex multi-hop question answering has become a research focus in retrieval-augmented generation (RAG). Many existing approaches aim to mimic human thought processes by enabling large models to perform retrieval-augmented generation step by step. However, these methods can only perform single-chain reasoning, which lacks multi-path exploration, strategic look-ahead, stepwise evaluation, and global selection. In addition, to decompose complex problems effectively, these methods must rely on labor-intensive intermediate annotations for supervised fine-tuning. To address these issues, we propose GRAT, an algorithm guided by Monte Carlo Tree Search (MCTS) and process rewards. GRAT not only enables self-evaluation and self-correction but also assigns fine-grained rewards to each intermediate step in the search path. These fine-grained annotations can be used for model self-training, which enables GRAT to continuously self-update its problem analysis and reasoning capabilities. We conducted experiments on four multi-hop QA datasets (HotPotQA, 2WikiMultiHopQA, MuSiQue, and Bamboogle), demonstrating that GRAT outperforms various RAG-based methods. Additionally, incorporating self-training significantly enhances GRAT’s reasoning performance.

2024

CNEQ: Incorporating numbers into Knowledge Graph Reasoning
Xianshu Peng | Wei Wei | Kaihe Xu | Dangyang Chen
Findings of the Association for Computational Linguistics: EMNLP 2024

Complex logical reasoning over knowledge graphs lies at the heart of many semantic downstream applications and thus has been extensively explored in recent years. However, nearly all existing methods overlook the rich semantics of numerical entities (e.g., magnitude, unit, and distribution), simply treating them as common entities or even removing them outright. This may severely hinder performance on queries involving numerical comparison or numerical computation. To address this issue, we propose the Complex Number and Entity Query model (CNEQ), which comprises a Number-Entity Predictor and an Entity Filter. The Number-Entity Predictor can independently learn the structural and semantic features of entities and numerical values, thereby enabling better prediction of entities as well as numerical values. The Entity Filter can compare or calculate numerical values to filter out entities that meet certain numerical constraints. To evaluate our model, we generated a variety of multi-hop complex logical queries including numerical values on three widely used knowledge graphs: FB15K, DB15K, and YAGO15K. Experimental results demonstrate that CNEQ achieves state-of-the-art results.