Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models

Yu-An Liu, Ruqing Zhang, Hongru Song, Jiafeng Guo, Yixing Fan, Xueqi Cheng


Abstract
While neural ranking models (NRMs) have achieved state-of-the-art performance in information retrieval, they remain highly vulnerable to imperceptible adversarial perturbations. Existing defenses are predominantly data-centric, exemplified by adversarial training, which requires constructing large collections of adversarial examples. By treating NRMs as black boxes and indiscriminately optimizing all model parameters, these methods incur substantial computational cost and often degrade performance on clean data due to overfitting. In this paper, we argue that adversarial vulnerability is not uniformly distributed across model parameters, but instead originates from specific internal units. We propose a paradigm shift toward a model-centric defense that addresses vulnerability at its architectural source, without requiring costly retraining or adversarial data generation. Specifically, we introduce Search in the Model, a novel training-free framework that performs fine-grained identification and rectification of vulnerable neurons directly within the model. By formulating neuron identification as a ranking problem, we develop a maximum marginal vulnerability criterion to precisely locate the top-K neurons most responsible for model vulnerability, and apply targeted neuronal inverse perturbation to correct them. Extensive experiments on MS MARCO and TREC 19 show that our approach outperforms state-of-the-art baselines in both defense efficiency and robustness to seen and unseen attacks, while preserving strong performance on clean data.
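The abstract does not give the exact form of the maximum marginal vulnerability criterion or of the inverse perturbation, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes, hypothetically, that each neuron comes with a list of activation shifts (perturbed minus clean) measured on a few probe inputs, that vulnerability is the mean absolute shift, that redundancy is cosine similarity to already selected neurons (in the spirit of maximal marginal relevance), and that correction subtracts the mean shift. The names `select_neurons`, `inverse_perturb`, `shifts`, and `lam` are all invented for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_neurons(shifts, k, lam=0.7):
    """Greedily pick up to k neuron indices by a marginal score.

    shifts[i] holds the (assumed) activation shifts of neuron i on probe inputs.
    Score = lam * vulnerability - (1 - lam) * redundancy with neurons already picked.
    """
    vulnerability = [sum(abs(x) for x in s) / len(s) for s in shifts]
    selected = []
    candidates = set(range(len(shifts)))
    while candidates and len(selected) < k:
        def marginal(i):
            redundancy = max(
                (cosine(shifts[i], shifts[j]) for j in selected), default=0.0
            )
            return lam * vulnerability[i] - (1 - lam) * redundancy

        # sorted() makes ties resolve to the lowest neuron index.
        best = max(sorted(candidates), key=marginal)
        selected.append(best)
        candidates.remove(best)
    return selected


def inverse_perturb(activations, shifts, selected):
    """Subtract each selected neuron's mean shift from its activation (assumed correction)."""
    corrected = list(activations)
    for i in selected:
        corrected[i] -= sum(shifts[i]) / len(shifts[i])
    return corrected


# Neurons 0 and 1 shift in the same direction; neuron 2 shifts differently.
shifts = [[1.0, 1.0], [0.9, 0.9], [0.6, -0.6]]
top = select_neurons(shifts, k=2, lam=0.5)
print(top)
print(inverse_perturb([2.0, 2.0, 2.0], shifts, top))
```

In this toy setting, lowering `lam` favors neurons whose shifts differ from those already chosen, while `lam=1.0` reduces the procedure to a plain top-K by vulnerability score.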
Anthology ID:
2026.acl-long.1076
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
23474–23484
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1076/
Cite (ACL):
Yu-An Liu, Ruqing Zhang, Hongru Song, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2026. Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 23474–23484, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models (Liu et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1076.pdf
Checklist:
 2026.acl-long.1076.checklist.pdf