CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation
Chao-Wei Huang, Chen-Yu Hsu, Tsu-Yuan Hsu, Chen-An Li, Yun-Nung Chen
Abstract
Conversational search provides a natural interface for information retrieval (IR). Recent approaches have demonstrated promising results in applying dense retrieval to conversational IR. However, training dense retrievers requires large amounts of in-domain paired data. This hinders the development of conversational dense retrievers, as abundant in-domain conversations are expensive to collect. In this paper, we propose Converser, a framework for training conversational dense retrievers with at most 6 examples of in-domain dialogues. Specifically, we utilize the in-context learning capability of large language models to generate conversational queries given a passage in the retrieval corpus. Experimental results on conversational retrieval benchmarks OR-QuAC and TREC CAsT 19 show that the proposed Converser achieves comparable performance to fully-supervised models, demonstrating the effectiveness of our proposed framework in few-shot conversational dense retrieval. All source code and generated datasets are available: https://github.com/MiuLab/CONVERSER- Anthology ID:
- 2023.sigdial-1.34
- Volume:
- Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month:
- September
- Year:
- 2023
- Address:
- Prague, Czechia
- Editors:
- Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 381–387
- Language:
- URL:
- https://aclanthology.org/2023.sigdial-1.34
- DOI:
- 10.18653/v1/2023.sigdial-1.34
- Cite (ACL):
- Chao-Wei Huang, Chen-Yu Hsu, Tsu-Yuan Hsu, Chen-An Li, and Yun-Nung Chen. 2023. CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 381–387, Prague, Czechia. Association for Computational Linguistics.
- Cite (Informal):
- CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation (Huang et al., SIGDIAL 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.sigdial-1.34.pdf