Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models
Yu-An Liu, Ruqing Zhang, Hongru Song, Jiafeng Guo, Yixing Fan, Xueqi Cheng
Abstract
While neural ranking models (NRMs) have achieved state-of-the-art performance in information retrieval, they remain highly vulnerable to imperceptible adversarial perturbations. Existing defenses are predominantly data-centric, exemplified by adversarial training, which requires constructing large collections of adversarial examples. By treating NRMs as black boxes and indiscriminately optimizing all model parameters, these methods incur substantial computational cost and often degrade performance on clean data due to overfitting. In this paper, we advocate that adversarial vulnerability is not uniformly distributed across model parameters, but instead originates from specific internal units. We propose a paradigm shift toward a model-centric defense that addresses vulnerability at its architectural source, without requiring costly retraining or adversarial data generation. Specifically, we introduce Search in the Model, a novel training-free framework that performs fine-grained identification and rectification of vulnerable neurons directly within the model. By formulating neuron identification as a ranking problem, we develop a maximum marginal vulnerability criterion to precisely locate the top-K neurons most responsible for model vulnerability, and apply targeted neuronal inverse perturbation to correct them. Extensive experiments on MS MARCO and TREC 19 show that our approach outperforms state-of-the-art baselines in both defense efficiency and robustness to seen and unseen attacks, while preserving strong performance on clean data.
- Anthology ID:
- 2026.acl-long.1076
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 23474–23484
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1076/
- Cite (ACL):
- Yu-An Liu, Ruqing Zhang, Hongru Song, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2026. Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 23474–23484, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Stop Hardening Everything: A Training-Free Neuron-Level Defense for Neural Ranking Models (Liu et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1076.pdf