Abstract
Despite advancements in Large Language Models (LLMs), many complex tasks are not easily solved in a single inference step, requiring the use of agentic LLMs in interactive environments. However, agentic LLMs suffer from a phenomenon known as reasoning derailment, due to the indiscriminate incorporation of observations from partially observable environments. We introduce QuBE, a method that enhances agents’ focus on task-relevant contexts by constructing a belief state via question answering. We validate QuBE through experiments in two agentic LLM scenarios with partial observability: 1) a canonical interactive decision-making scenario using text-based game engines, and 2) an interactive retrieval-augmented generation (RAG) scenario using search engines. In the AlfWorld text-based game, QuBE outperforms established baselines by substantial margins, and in the search engine scenario, it achieves marked improvements on the BeIR zero-shot retrieval benchmark. The results demonstrate that QuBE significantly mitigates reasoning derailment, refining the decision-making process of LLM agents in partially observed environments.
- Anthology ID:
- 2024.emnlp-main.1193
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 21403–21423
- URL:
- https://aclanthology.org/2024.emnlp-main.1193
- DOI:
- 10.18653/v1/2024.emnlp-main.1193
- Cite (ACL):
- Minsoo Kim, Jongyoon Kim, Jihyuk Kim, and Seung-won Hwang. 2024. QuBE: Question-based Belief Enhancement for Agentic LLM Reasoning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 21403–21423, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- QuBE: Question-based Belief Enhancement for Agentic LLM Reasoning (Kim et al., EMNLP 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.emnlp-main.1193.pdf
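The belief-construction step the abstract describes, answering task-relevant questions over new observations instead of ingesting them wholesale, can be pictured as a simple QA loop. Below is a minimal Python sketch of that idea, not the authors' implementation: the `llm` callable, the prompt wording, and the function names `update_belief` and `act` are all hypothetical.

```python
# Hypothetical sketch of a question-based belief update for an agentic LLM.
# `llm` stands in for any text-completion function (prompt -> string);
# none of these names or prompts come from the QuBE paper's code.
from typing import Callable, Dict, List

def update_belief(llm: Callable[[str], str],
                  task: str,
                  belief: Dict[str, str],
                  observation: str) -> Dict[str, str]:
    """Refresh the belief state by asking task-relevant questions
    and answering them from the latest observation."""
    # 1) Ask which questions matter for the task right now.
    questions: List[str] = llm(
        f"Task: {task}\nCurrent belief: {belief}\n"
        "List the questions whose answers are needed to act next."
    ).splitlines()
    # 2) Answer each question from the new observation; only answerable,
    #    task-relevant facts enter the belief state, not the raw text.
    for q in questions:
        answer = llm(
            f"Observation: {observation}\nQuestion: {q}\n"
            "Answer briefly, or say 'unknown'."
        ).strip()
        if answer.lower() != "unknown":
            belief[q] = answer
    return belief

def act(llm: Callable[[str], str], task: str, belief: Dict[str, str]) -> str:
    # 3) Condition the next action on the filtered belief state,
    #    rather than on the full observation history.
    return llm(f"Task: {task}\nBelief: {belief}\nNext action:")
```

Under this reading, the belief state acts as a question-indexed cache of verified facts, which is how indiscriminate observations would be kept from derailing the agent's reasoning.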