Abstract
Deep neural networks for Natural Language Processing (NLP) have been demonstrated to be vulnerable to textual adversarial examples. Existing black-box attacks typically require thousands of queries on the target model, making them expensive in real-world applications. In this paper, we propose a new approach that guides the word substitutions using prior knowledge from the training set to improve the attack efficiency. Specifically, we introduce Adversarial Boosting Preference (ABP), a metric that quantifies the importance of words and guides adversarial word substitutions. We then propose two query-efficient attack strategies based on ABP: query-free attack (ABPfree) and guided search attack (ABPguide). Extensive evaluations on text classification demonstrate that ABPfree generates more natural adversarial examples than existing universal attacks, while ABPguide reduces the number of queries by a factor of 10 to 500 and achieves comparable or even better performance than black-box attack baselines. Furthermore, we introduce ABPens, the first ensemble attack in NLP, which ensembles ABP across different models and domains to gain further performance improvements and better transferability and generalization. Code is available at https://github.com/BaiDingHub/ABP.

- Anthology ID: 2024.naacl-long.31
- Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Kevin Duh, Helena Gomez, Steven Bethard
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 556–569
- URL: https://aclanthology.org/2024.naacl-long.31
- DOI: 10.18653/v1/2024.naacl-long.31
- Cite (ACL): Zhen Yu, Zhenhua Chen, and Kun He. 2024. Query-Efficient Textual Adversarial Example Generation for Black-Box Attacks. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 556–569, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): Query-Efficient Textual Adversarial Example Generation for Black-Box Attacks (Yu et al., NAACL 2024)
- PDF: https://preview.aclanthology.org/landing_page/2024.naacl-long.31.pdf