Baixuan Li
2025
RASPberry: Retrieval-Augmented Monte Carlo Tree Self-Play with Reasoning Consistency for Multi-Hop Question Answering
Baixuan Li | Yunlong Fan | Tianyi Ma | Miao Gao | Chuanqi Shi | Zhiqiang Gao
Findings of the Association for Computational Linguistics: ACL 2025
Complex multi-hop question answering requires large language models (LLMs) not only to retrieve external knowledge but also to reason over the retrieved information in order to arrive at the final solution. This involves two key challenges: (i) how to effectively explore the solution space and generate more potentially correct solution candidates, and (ii) how to select the optimal solution from multiple solution candidates, both of which require a training-free approach without introducing a more powerful teacher model. To address these challenges, we propose Retrieval-Augmented Monte Carlo Tree Self-Play with Reasoning Consistency (RASPberry), which introduces a more flexible action-level sampling granularity compared to existing methods, leverages Monte Carlo Tree Search for efficient solution space exploration, and utilizes an enhanced version of reasoning consistency to guide the selection of the optimal solution. Experimental results demonstrate that our proposed RASPberry effectively tackles the two challenges outlined above, achieving more efficient RAG inference-time scaling. Our code is available at https://github.com/BaixuanLi/RASPberry.
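The abstract describes the method only at a high level. As a rough, illustrative sketch of the kind of tree search such approaches build on, the Python snippet below shows a generic UCT selection rule for Monte Carlo Tree Search over candidate actions. The node structure, action labels, and reward values are all hypothetical assumptions for illustration; this is not the RASPberry implementation, which is available in the linked repository.

```python
# Minimal, illustrative UCT selection for Monte Carlo Tree Search.
# NOT the RASPberry implementation; node fields and action labels are assumptions.
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    action: str                       # hypothetical reasoning/retrieval action label
    visits: int = 0                   # number of rollouts through this node
    value: float = 0.0                # accumulated rollout reward
    children: list["Node"] = field(default_factory=list)

def uct_select(parent: Node, c: float = 1.4) -> Node:
    """Pick the child maximizing the UCT score (exploitation + exploration)."""
    def score(child: Node) -> float:
        if child.visits == 0:
            return float("inf")       # always try unvisited children first
        exploit = child.value / child.visits
        explore = c * math.sqrt(math.log(parent.visits) / child.visits)
        return exploit + explore
    return max(parent.children, key=score)

# Toy example: choose among three candidate actions after a few simulated rollouts.
root = Node("root", visits=10, children=[
    Node("retrieve", visits=4, value=2.5),
    Node("reason",   visits=5, value=3.0),
    Node("answer",   visits=1, value=0.2),
])
print(uct_select(root).action)        # prints the child with the highest UCT score
```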
2024
SEAVER: Attention Reallocation for Mitigating Distractions in Language Models for Conditional Semantic Textual Similarity Measurement
Baixuan Li | Yunlong Fan | Zhiqiang Gao
Findings of the Association for Computational Linguistics: EMNLP 2024
Conditional Semantic Textual Similarity (C-STS) introduces specific limiting conditions to the traditional Semantic Textual Similarity (STS) task, posing challenges for STS models. Language models employing cross-encoding demonstrate satisfactory performance in STS, yet their effectiveness significantly diminishes in C-STS. In this work, we argue that this failure occurs because redundant information in the text distracts language models from the required condition-relevant information. To alleviate this, we propose Self-Augmentation via Self-Reweighting (SEAVER), which, based solely on models’ internal attention and without the need for external auxiliary information, adaptively reallocates the model’s attention weights by emphasizing the importance of condition-relevant tokens. On the C-STS-2023 test set, SEAVER consistently improves the performance of all million-scale fine-tuned baseline models (by up to around 3 points), and even surpasses that of billion-scale few-shot prompted large language models (such as GPT-4). Our code is available at https://github.com/BaixuanLi/SEAVER.
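As a rough illustration of attention-based token reweighting in the spirit described above, the sketch below recomputes a pooled sentence representation by upweighting tokens that attend strongly to a condition span. The function name, tensor shapes, and mean-pooling choice are illustrative assumptions, not the actual SEAVER procedure, which is defined in the paper and the linked repository.

```python
# Illustrative sketch (NOT the SEAVER implementation): emphasize tokens that
# place more attention mass on condition tokens when pooling a representation.
import torch

def reweight_by_condition_attention(hidden, attn, condition_mask):
    """
    hidden:         (seq_len, dim)      token representations from an encoder
    attn:           (seq_len, seq_len)  averaged attention weights (row attends to column)
    condition_mask: (seq_len,) bool     True where tokens belong to the condition span
    Returns a pooled vector that emphasizes condition-relevant tokens.
    """
    # Relevance of each token = total attention it places on condition tokens.
    relevance = attn[:, condition_mask].sum(dim=-1)        # (seq_len,)
    weights = torch.softmax(relevance, dim=-1)             # normalize to a distribution
    return (weights.unsqueeze(-1) * hidden).sum(dim=0)     # weighted mean pooling

# Toy example with random tensors standing in for real encoder outputs.
seq_len, dim = 6, 8
hidden = torch.randn(seq_len, dim)
attn = torch.softmax(torch.randn(seq_len, seq_len), dim=-1)
condition_mask = torch.tensor([False, False, True, True, False, False])
pooled = reweight_by_condition_attention(hidden, attn, condition_mask)
print(pooled.shape)  # torch.Size([8])
```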