Interpret and Control Dense Retrieval with Sparse Latent Features

Hao Kang, Tevin Wang, Chenyan Xiong


Abstract
Dense embeddings deliver strong retrieval performance but often lack interpretability and controllability. This paper introduces an approach that uses sparse autoencoders (SAEs) to interpret and control dense embeddings through their learned sparse latent features. Our key contribution is a retrieval-oriented contrastive loss, which keeps the sparse latent features effective for retrieval tasks and thus meaningful to interpret. Experimental results demonstrate that both the learned sparse latent features and their reconstructed embeddings retain nearly the same retrieval accuracy as the original dense vectors, affirming their faithfulness. Further examination of the sparse latent space reveals interesting features underlying the dense embeddings. We can also control retrieval behavior by manipulating the sparse latent features, for example, by prioritizing documents from specific perspectives in the retrieval results.
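The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the top-k sparsity rule, the in-batch form of the contrastive loss, applying that loss to the sparse latents, the weighting `alpha`, and all function and parameter names (`encode_topk`, `decode`, `in_batch_contrastive`, `sae_loss`, `steer`) are hypothetical.

```python
import numpy as np

def encode_topk(x, W_enc, b_enc, k):
    """Map dense embeddings (n, d) to k-sparse, non-negative latents (n, m)."""
    pre = np.maximum(x @ W_enc + b_enc, 0.0)        # ReLU activations
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]  # k largest per row
    z = np.zeros_like(pre)
    np.put_along_axis(z, idx, np.take_along_axis(pre, idx, axis=1), axis=1)
    return z

def decode(z, W_dec, b_dec):
    """Reconstruct dense embeddings from sparse latents."""
    return z @ W_dec + b_dec

def in_batch_contrastive(q, d):
    """Softmax cross-entropy where q[i] should score highest against d[i]."""
    logits = q @ d.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def sae_loss(q, d, W_enc, b_enc, W_dec, b_dec, k, alpha=1.0):
    """Reconstruction error plus a retrieval-oriented contrastive term.

    Assumption: the contrastive term is computed on the sparse latents and
    weighted by alpha; the paper's exact formulation may differ.
    """
    zq = encode_topk(q, W_enc, b_enc, k)
    zd = encode_topk(d, W_enc, b_enc, k)
    recon = np.mean((q - decode(zq, W_dec, b_dec)) ** 2)
    recon += np.mean((d - decode(zd, W_dec, b_dec)) ** 2)
    return float(recon + alpha * in_batch_contrastive(zq, zd))

def steer(z, feature, boost):
    """Return a copy of the latents with one feature increased by `boost`."""
    z_steered = z.copy()
    z_steered[:, feature] += boost
    return z_steered
```

Under these assumptions, steering a query would amount to calling `steer` on its latent vector and then either retrieving in the latent space directly or passing the result through `decode` to retrieve with the reconstructed embedding.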
Anthology ID:
2025.naacl-short.58
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
700–709
URL:
https://preview.aclanthology.org/corrections-2025-06/2025.naacl-short.58/
DOI:
10.18653/v1/2025.naacl-short.58
Cite (ACL):
Hao Kang, Tevin Wang, and Chenyan Xiong. 2025. Interpret and Control Dense Retrieval with Sparse Latent Features. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 700–709, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Interpret and Control Dense Retrieval with Sparse Latent Features (Kang et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-06/2025.naacl-short.58.pdf