On the Robustness of Self-Attentive Models
Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh
Abstract
This work examines the robustness of self-attentive neural networks against adversarial input perturbations. Specifically, we investigate the attention and feature extraction mechanisms of state-of-the-art recurrent neural networks and self-attentive architectures for sentiment analysis, entailment, and machine translation under adversarial attacks. We also propose a novel attack algorithm for generating more natural adversarial examples that could mislead neural models but not humans. Experimental results show that, compared to recurrent neural models, self-attentive models are more robust against adversarial perturbation. In addition, we provide theoretical explanations for their superior robustness to support our claims.
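The abstract refers to word-level adversarial perturbations of text classifiers. As a rough, hedged illustration only (this is a generic greedy word-substitution attack, not the algorithm proposed in the paper; `predict_fn`, the synonym table, and all other names are hypothetical), such an attack can be sketched as:

```python
# Minimal sketch of a greedy word-substitution attack on a text classifier.
# NOTE: generic illustration only, not the paper's proposed attack algorithm;
# predict_fn, the synonym table, and all names here are hypothetical.

from typing import Callable, Dict, List


def greedy_substitution_attack(
    text: str,
    predict_fn: Callable[[str], float],  # returns P(true label | text)
    synonyms: Dict[str, List[str]],      # candidate replacements per word
    max_changes: int = 3,
) -> str:
    """Greedily replace words to lower the model's confidence in the true label."""
    tokens = text.split()
    changes = 0
    for i, tok in enumerate(tokens):
        if changes >= max_changes:
            break
        best_score = predict_fn(" ".join(tokens))
        best_sub = None
        for cand in synonyms.get(tok.lower(), []):
            trial = tokens[:i] + [cand] + tokens[i + 1:]
            score = predict_fn(" ".join(trial))
            if score < best_score:  # keep the substitution that hurts the model most
                best_score, best_sub = score, cand
        if best_sub is not None:
            tokens[i] = best_sub
            changes += 1
    return " ".join(tokens)


if __name__ == "__main__":
    # Toy predictor: confidence in the "positive" label drops if "great" disappears.
    def toy_predict(t: str) -> float:
        return 0.9 if "great" in t else 0.4

    synonyms = {"great": ["fine", "decent"], "movie": ["film"]}
    adv = greedy_substitution_attack("a great movie overall", toy_predict, synonyms)
    print(adv)  # e.g. "a fine movie overall"
```

The paper's comparison of recurrent and self-attentive models then amounts to measuring how much such perturbations degrade each architecture's predictions.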
- Anthology ID: P19-1147
- Volume: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month: July
- Year: 2019
- Address: Florence, Italy
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 1520–1529
- URL: https://aclanthology.org/P19-1147
- DOI: 10.18653/v1/P19-1147
- Cite (ACL): Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, and Cho-Jui Hsieh. 2019. On the Robustness of Self-Attentive Models. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1520–1529, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal): On the Robustness of Self-Attentive Models (Hsieh et al., ACL 2019)
- PDF: https://preview.aclanthology.org/ingestion-script-update/P19-1147.pdf
- Data: MultiNLI