Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval
Haotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang
Abstract
In Embedding-Based Retrieval (EBR), Approximate Nearest Neighbor (ANN) algorithms are widely adopted for efficient large-scale search. However, recent studies reveal a query out-of-distribution (OOD) issue, where query and base embeddings follow mismatched distributions, significantly degrading ANN performance. In this work, we empirically verify the generality of this phenomenon and provide a quantitative analysis. To mitigate the distributional gap, we introduce a distribution regularizer into the encoder training objective, encouraging alignment between query and base embeddings. Extensive experiments across multiple datasets, encoders, and ANN indices show that our method consistently improves retrieval performance.
- Anthology ID:
- 2025.findings-emnlp.340
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6418–6427
- URL:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.340/
- DOI:
- 10.18653/v1/2025.findings-emnlp.340
- Cite (ACL):
- Haotong Bao, Jianjin Zhang, Qi Chen, Weihao Han, Zhengxin Zeng, Ruiheng Chang, Mingzheng Li, Hao Sun, Weiwei Deng, Feng Sun, and Qi Zhang. 2025. Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 6418–6427, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Alleviating Performance Degradation Caused by Out-of-Distribution Issues in Embedding-Based Retrieval (Bao et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.340.pdf
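
The abstract describes adding a distribution regularizer to the encoder training objective so that query and base embeddings align. The abstract does not specify the form of that regularizer, so the following is only a minimal sketch under a stated assumption: a simple per-dimension moment-matching penalty (mean and variance) stands in for it. The names `column_moments`, `moment_gap`, `regularized_loss`, and the `weight` parameter are hypothetical and do not come from the paper.

```python
from statistics import fmean, pvariance


def column_moments(vectors):
    """Return per-dimension means and population variances of a list of vectors."""
    columns = list(zip(*vectors))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def moment_gap(queries, bases):
    """Assumed stand-in regularizer: squared gap between the per-dimension
    means and variances of query and base embeddings (not the paper's term)."""
    q_mean, q_var = column_moments(queries)
    b_mean, b_var = column_moments(bases)
    mean_gap = sum((q - b) ** 2 for q, b in zip(q_mean, b_mean))
    var_gap = sum((q - b) ** 2 for q, b in zip(q_var, b_var))
    return mean_gap + var_gap


def regularized_loss(retrieval_loss, queries, bases, weight=0.1):
    """Add the weighted distribution penalty to an already-computed retrieval loss."""
    return retrieval_loss + weight * moment_gap(queries, bases)


# Toy embeddings: the query set is the base set shifted by 1 along dimension 0.
queries = [[1.0, 0.0], [3.0, 0.0]]
bases = [[0.0, 0.0], [2.0, 0.0]]
gap = moment_gap(queries, bases)
total = regularized_loss(0.5, queries, bases, weight=0.1)
```

In this toy case the two sets differ only in their mean, so the penalty comes entirely from the mean term; identical sets would give a penalty of zero. In actual training the penalty would be computed on encoder outputs in a differentiable framework, which this plain-Python sketch does not attempt.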