Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum


Abstract
Efficient k-nearest neighbor search is a fundamental task, foundational for many problems in NLP. When the similarity is measured by dot-product between dual-encoder vectors or L2-distance, there already exist many scalable and efficient search methods. But not so when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor. The cross-encoders’ high computational cost typically limits their use to reranking candidates retrieved by a cheaper model, such as dual encoder or TF-IDF. However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and potentially requires additional training to align the auxiliary retrieval model with the cross-encoder model. In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. Retrieval is made efficient with CUR decomposition, a matrix decomposition approach that approximates all pairwise cross-encoder distances from a small subset of rows and columns of the distance matrix. Indexing items using our approach is computationally cheaper than training an auxiliary dual-encoder model through distillation. Empirically, for k > 10, our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods that re-rank items retrieved using a dual-encoder or TF-IDF.
Anthology ID:
2022.emnlp-main.140
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
2171–2194
Language:
URL:
https://aclanthology.org/2022.emnlp-main.140
DOI:
Bibkey:
Cite (ACL):
Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, and Andrew McCallum. 2022. Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2171–2194, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization (Yadav et al., EMNLP 2022)
Copy Citation:
PDF:
https://preview.aclanthology.org/emnlp-22-ingestion/2022.emnlp-main.140.pdf