@inproceedings{jin-etal-2023-instructor,
  title     = {{InstructoR}: Instructing Unsupervised Conversational Dense Retrieval with Large Language Models},
  author    = {Jin, Zhuoran and
               Cao, Pengfei and
               Chen, Yubo and
               Liu, Kang and
               Zhao, Jun},
  editor    = {Bouamor, Houda and
               Pino, Juan and
               Bali, Kalika},
  booktitle = {Findings of the Association for Computational Linguistics: {EMNLP} 2023},
  month     = dec,
  year      = {2023},
  address   = {Singapore},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2023.findings-emnlp.443/},
  doi       = {10.18653/v1/2023.findings-emnlp.443},
  pages     = {6649--6675},
  abstract  = {Compared to traditional single-turn ad-hoc retrieval, conversational retrieval needs to handle the multi-turn conversation and understand the user's real query intent. However, most existing methods simply fine-tune the pre-trained ad-hoc retriever on limited supervised data, making it challenging for the retriever to fully grasp the entirety of the conversation. In this paper, we find that large language models (LLMs) can accurately discover the user's query intent from the complex conversation context and provide the supervised signal to instruct the retriever in an unsupervised manner. Therefore, we propose a novel method termed InstructoR to Instruct unsupervised conversational dense Retrieval with LLMs. We design an unsupervised training framework that employs LLMs to estimate the session-passage relevance score as the soft label to guide the retriever's training. Specially, we devise three instructing strategies from context, query and response perspectives to calculate the relevance score more precisely, including conversational retrieval as conversation generation, question rewrite as latent variable and question response as posterior guide. Experimental results show InstructoR can bring significant improvements across various ad-hoc retrievers, even surpassing the current supervised state-of-the-art method. We also demonstrate the effectiveness of our method under low-resource and zero-shot settings. Our code is publicly available at https://github.com/jinzhuoran/InstructoR/.},
}
Markdown (Informal)
[InstructoR: Instructing Unsupervised Conversational Dense Retrieval with Large Language Models](https://aclanthology.org/2023.findings-emnlp.443/) (Jin et al., Findings 2023)
ACL