Saving Dense Retriever from Shortcut Dependency in Conversational Search

Sungdong Kim, Gangwoo Kim


Abstract
Conversational search (CS) needs a holistic understanding of conversational inputs to retrieve relevant passages. In this paper, we demonstrate the existence of a retrieval shortcut in CS, which causes models to retrieve passages by relying solely on partial history while disregarding the latest question. Through in-depth analysis, we first show that naively trained dense retrievers heavily exploit the shortcut and hence perform poorly when asked to answer history-independent questions. To build models that are more robust to shortcut dependency, we explore various hard negative mining strategies. Experimental results show that training with model-based hard negatives effectively mitigates the dependency on the shortcut, significantly improving dense retrievers on recent CS benchmarks. In particular, our retriever outperforms the previous state-of-the-art model by 11.0 in Recall@10 on QReCC.
Anthology ID:
2022.emnlp-main.701
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10278–10287
URL:
https://aclanthology.org/2022.emnlp-main.701
DOI:
10.18653/v1/2022.emnlp-main.701
Cite (ACL):
Sungdong Kim and Gangwoo Kim. 2022. Saving Dense Retriever from Shortcut Dependency in Conversational Search. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10278–10287, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Saving Dense Retriever from Shortcut Dependency in Conversational Search (Kim & Kim, EMNLP 2022)
PDF:
https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.701.pdf