MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval

Yifan Liu, Qianfeng Wen, Mark Zhao, Jiazhou Liang, Scott Sanner


Abstract
Dense Passage Retrieval (DPR) typically relies on Euclidean or cosine distance to measure query–passage relevance in embedding space, which is effective when embeddings lie on a linear manifold. However, our experiments across DPR benchmarks suggest that embeddings often lie on lower-dimensional, non-linear manifolds, especially in out-of-distribution (OOD) settings, where cosine and Euclidean distance fail to capture semantic similarity. To address this limitation, we propose a *manifold-aware* distance metric for DPR (**MA-DPR**) that models the intrinsic manifold structure of passages using a nearest-neighbor graph and measures query–passage distance based on their shortest path in this graph. We show that MA-DPR outperforms Euclidean and cosine distances by up to **26%** on OOD passage retrieval, with comparable in-distribution performance across various embedding models, while incurring a minimal increase in query inference time. Empirical evidence suggests that manifold-aware distance allows DPR to leverage context from related neighboring passages, making it effective even in the absence of direct semantic overlap. MA-DPR can be applied to a wide range of dense embedding and retrieval tasks, offering potential benefits across a wide spectrum of domains.
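The graph-based distance described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the edge weights, the neighborhood size, or how the query is attached to the graph. This sketch assumes cosine distance as the edge weight, a symmetrized k-nearest-neighbor graph, and a query connected to its k nearest passages. The names `cosine_distance`, `build_knn_graph`, and `manifold_rank` are hypothetical.

```python
import heapq
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def build_knn_graph(passages, k):
    """Build an undirected k-nearest-neighbor graph over passage embeddings.

    Returns an adjacency dict: node index -> {neighbor index: edge weight}.
    Assumption: edges are weighted by cosine distance.
    """
    graph = {i: {} for i in range(len(passages))}
    for i, p in enumerate(passages):
        dists = sorted(
            (cosine_distance(p, q), j) for j, q in enumerate(passages) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize so paths can be walked in both directions
    return graph


def manifold_rank(query, passages, graph, k):
    """Rank passage indices by shortest-path distance from the query.

    Assumption: the query is attached to its k nearest passages, which act
    as the starting points of a multi-source Dijkstra search.
    """
    seeds = sorted(
        (cosine_distance(query, p), i) for i, p in enumerate(passages)
    )[:k]
    best = {}
    heap = list(seeds)
    heapq.heapify(heap)
    while heap:
        d, i = heapq.heappop(heap)
        if i in best:
            continue  # already settled with a shorter path
        best[i] = d
        for j, w in graph[i].items():
            if j not in best:
                heapq.heappush(heap, (d + w, j))
    # Passages unreachable from the query get infinite distance and sort last.
    return sorted(range(len(passages)), key=lambda i: (best.get(i, math.inf), i))


# Toy data: four 2-D "embeddings" placed along an arc at 0, 20, 50, and 90 degrees.
angles = [0, 20, 50, 90]
passages = [[math.cos(math.radians(a)), math.sin(math.radians(a))] for a in angles]
graph = build_knn_graph(passages, k=1)
ranking = manifold_rank([1.0, 0.0], passages, graph, k=1)
print(ranking)
```

In this toy setup the graph is a simple chain, so the path-based ranking follows the arc from the passage nearest the query outward. For a real corpus, an approximate nearest-neighbor index would likely replace the quadratic neighbor search used here.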
Anthology ID:
2025.emnlp-main.1582
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
31073–31091
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1582/
Cite (ACL):
Yifan Liu, Qianfeng Wen, Mark Zhao, Jiazhou Liang, and Scott Sanner. 2025. MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31073–31091, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
MA-DPR: Manifold-aware Distance Metrics for Dense Passage Retrieval (Liu et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1582.pdf
Checklist:
 2025.emnlp-main.1582.checklist.pdf