Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization
Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum
Abstract
Efficient k-nearest neighbor search is a fundamental task, foundational for many problems in NLP. When the similarity is measured by dot-product between dual-encoder vectors or L2-distance, there already exist many scalable and efficient search methods. But not so when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor. The cross-encoders’ high computational cost typically limits their use to reranking candidates retrieved by a cheaper model, such as dual encoder or TF-IDF. However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and potentially requires additional training to align the auxiliary retrieval model with the cross-encoder model. In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. Retrieval is made efficient with CUR decomposition, a matrix decomposition approach that approximates all pairwise cross-encoder distances from a small subset of rows and columns of the distance matrix. Indexing items using our approach is computationally cheaper than training an auxiliary dual-encoder model through distillation. Empirically, for k > 10, our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods that re-rank items retrieved using a dual-encoder or TF-IDF.- Anthology ID:
- 2022.emnlp-main.140
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2171–2194
- Language:
- URL:
- https://aclanthology.org/2022.emnlp-main.140
- DOI:
- 10.18653/v1/2022.emnlp-main.140
- Cite (ACL):
- Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, and Andrew McCallum. 2022. Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2171–2194, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization (Yadav et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.140.pdf