Abstract
For pretrained language models such as Google’s BERT, recent research has designed several input-adaptive inference mechanisms to improve efficiency on cloud and edge devices. In this paper, we reveal a new attack surface on input-adaptive multi-exit BERT: an adversary can imperceptibly modify the input text to drastically increase the average inference cost. Our proposed slow-down attack, SlowBERT, integrates a new rank-and-substitute adversarial text generation algorithm that efficiently searches for the perturbation which maximally delays the exit time. Since the attacker has no direct access to the model internals, we further devise a time-based approximation algorithm that infers the exit position and serves as the loss oracle. Our extensive evaluation on two popular instances of multi-exit BERT for GLUE classification tasks validates the effectiveness of SlowBERT. In the worst case, SlowBERT increases the inference cost by 4.57×, which would severely degrade the service quality of multi-exit BERT in practice, e.g., by lengthening the response times of real-time cloud services for online users.
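To make the attack surface concrete, the sketch below illustrates the mechanism the abstract targets: a multi-exit encoder attaches a lightweight classifier after each transformer layer and returns as soon as an internal prediction is confident enough, so inference cost depends on the input. This is a minimal, hypothetical PyTorch sketch; the class name `MultiExitEncoder`, the entropy-threshold exit rule, and all hyperparameters are illustrative assumptions, not the paper’s implementation.

```python
import time
import torch
import torch.nn as nn

class MultiExitEncoder(nn.Module):
    """Toy multi-exit encoder: every transformer layer is followed by a
    lightweight classification head (an "internal exit"), in the spirit
    of multi-exit BERT variants. Names and sizes are illustrative
    assumptions, not the paper's implementation."""

    def __init__(self, num_layers=12, hidden=768, num_classes=2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(hidden, nhead=12, batch_first=True)
            for _ in range(num_layers))
        self.exits = nn.ModuleList(
            nn.Linear(hidden, num_classes) for _ in range(num_layers))

    def adaptive_forward(self, x, entropy_threshold=0.3):
        """Entropy-based early exit: return at the first internal head
        whose prediction entropy drops below the threshold. The returned
        exit index is what a slow-down attack tries to maximize."""
        for i, (layer, head) in enumerate(zip(self.layers, self.exits)):
            x = layer(x)
            logits = head(x[:, 0])                # pooled [CLS]-style token
            probs = torch.softmax(logits, dim=-1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
            if entropy.max().item() < entropy_threshold:
                return logits, i + 1              # confident: exit early
        return logits, len(self.layers)           # never confident: full depth

if __name__ == "__main__":
    model = MultiExitEncoder().eval()
    x = torch.randn(1, 16, 768)                   # stand-in for embedded tokens
    t0 = time.perf_counter()
    with torch.no_grad():
        _, depth = model.adaptive_forward(x)
    # Wall-clock latency grows with exit depth; this correlation is the
    # kind of side-channel signal a time-based exit-position estimate
    # can exploit when model internals are hidden.
    print(f"exited at layer {depth} in {time.perf_counter() - t0:.4f}s")
```

Under such a scheme, an adversarial word substitution that keeps every internal head uncertain forces the input through all layers, which is the slow-down effect the abstract quantifies; and because the exit index is not observable in a black-box deployment, measured latency can stand in for it, as in the paper’s time-based loss oracle.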
- Anthology ID: 2023.findings-acl.634
- Volume: Findings of the Association for Computational Linguistics: ACL 2023
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 9992–10007
- URL: https://aclanthology.org/2023.findings-acl.634
- DOI: 10.18653/v1/2023.findings-acl.634
- Cite (ACL): Shengyao Zhang, Xudong Pan, Mi Zhang, and Min Yang. 2023. SlowBERT: Slow-down Attacks on Input-adaptive Multi-exit BERT. In Findings of the Association for Computational Linguistics: ACL 2023, pages 9992–10007, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): SlowBERT: Slow-down Attacks on Input-adaptive Multi-exit BERT (Zhang et al., Findings 2023)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.findings-acl.634.pdf