Biancen Xie
2025
LLM-Based Dialogue Labeling for Multiturn Adaptive RAG
Zhiyu Chen | Biancen Xie | Sidarth Srinivasan | Manikandarajan Ramanathan | Rajashekar Maragoud | Qun Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Customer service often relies on human agents, who are effective but costly and slow to scale. Recent advances in intelligent chatbots, particularly Retrieval-Augmented Generation (RAG) models, have significantly improved efficiency by integrating large language models with external knowledge retrieval. However, developing a multi-turn RAG-based chatbot for real-world customer service presents additional complexities, requiring components such as adaptive retrieval and query reformulation. These components typically require substantial annotated data, which is often scarce. To overcome this limitation, we propose methods to automatically generate labels for these components from real customer-agent dialogue data. Specifically, we introduce two labeling strategies for adaptive retrieval, an intent-guided strategy and an explanation-based strategy, along with two query reformulation strategies, natural language query reformulation and keyword-based reformulation. Our experiments reveal that the explanation-based strategy yields the best results for adaptive retrieval, while keyword-based reformulation improves document retrieval quality. Our findings offer valuable insights for practitioners working on multi-turn RAG systems.