Shortcuts Arising from Contrast: Towards Effective and Lightweight Clean-Label Attacks in Prompt-Based Learning

Xiaopeng Xie; Ming Yan; Xiwen Zhou; Chenlong Zhao; Suli Wang; Yong Zhang; Joey Tianyi Zhou

doi:10.18653/v1/2024.emnlp-main.834

Shortcuts Arising from Contrast: Towards Effective and Lightweight Clean-Label Attacks in Prompt-Based Learning

Xiaopeng Xie, Ming Yan, Xiwen Zhou, Chenlong Zhao, Suli Wang, Yong Zhang, Joey Tianyi Zhou

Abstract

Prompt-based learning paradigm has been shown to be vulnerable to backdoor attacks. Current clean-label attack, employing a specific prompt as trigger, can achieve success without the need for external triggers and ensuring correct labeling of poisoned samples, which are more stealthy compared to the poisoned-label attack, but on the other hand, facing significant issues with false activations and pose greater challenges, necessitating a higher rate of poisoning. Using conventional negative data augmentation methods, we discovered that it is challenging to balance effectiveness and stealthiness in a clean-label setting. In addressing this issue, we are inspired by the notion that a backdoor acts as a shortcut, and posit that this shortcut stems from the contrast between the trigger and the data utilized for poisoning. In this study, we propose a method named Contrastive Shortcut Injection (CSI), by leveraging activation values, integrates trigger design and data selection strategies to craft stronger shortcut features. With extensive experiments on full-shot and few-shot text classification tasks, we empirically validate CSI’s high effectiveness and high stealthiness at low poisoning rates.

Anthology ID:: 2024.emnlp-main.834
Volume:: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:: November
Year:: 2024
Address:: Miami, Florida, USA
Editors:: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:: EMNLP
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 14966–14977
Language:
URL:: https://preview.aclanthology.org/fix-sig-urls/2024.emnlp-main.834/
DOI:: 10.18653/v1/2024.emnlp-main.834
Bibkey:
Cite (ACL):: Xiaopeng Xie, Ming Yan, Xiwen Zhou, Chenlong Zhao, Suli Wang, Yong Zhang, and Joey Tianyi Zhou. 2024. Shortcuts Arising from Contrast: Towards Effective and Lightweight Clean-Label Attacks in Prompt-Based Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 14966–14977, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):: Shortcuts Arising from Contrast: Towards Effective and Lightweight Clean-Label Attacks in Prompt-Based Learning (Xie et al., EMNLP 2024)
Copy Citation:
PDF:: https://preview.aclanthology.org/fix-sig-urls/2024.emnlp-main.834.pdf

PDF Cite Search Fix data