ARCH: Efficient Adversarial Regularized Training with Caching
Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao
Abstract
Adversarial regularization can improve model generalization in many natural language processing tasks. However, conventional approaches are computationally expensive, since they need to generate a perturbation for each sample in each epoch. We propose a new adversarial regularization method, ARCH (adversarial regularization with caching), where perturbations are generated and cached once every several epochs. As caching all the perturbations raises memory-usage concerns, we adopt a K-nearest-neighbors-based strategy to tackle this issue. The strategy requires caching only a small number of perturbations, without introducing additional training time. We evaluate our proposed method on a set of neural machine translation and natural language understanding tasks. We observe that ARCH significantly eases the computational burden (saving up to 70% of computational time compared with conventional approaches). More surprisingly, by reducing the variance of stochastic gradients, ARCH produces notably better (in most of the tasks) or comparable model generalization. Our code is publicly available.
- Anthology ID:
- 2021.findings-emnlp.348
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4118–4131
- URL:
- https://aclanthology.org/2021.findings-emnlp.348
- DOI:
- 10.18653/v1/2021.findings-emnlp.348
- Cite (ACL):
- Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, and Tuo Zhao. 2021. ARCH: Efficient Adversarial Regularized Training with Caching. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4118–4131, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- ARCH: Efficient Adversarial Regularized Training with Caching (Zuo et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2021.findings-emnlp.348.pdf
- Code
- SimiaoZuo/Caching-Adv
- Data
- ANLI, CoLA, GLUE, MRPC, SST, SST-2
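The caching-with-K-nearest-neighbors idea described in the abstract can be sketched in a few lines. This is a minimal illustrative toy, not the authors' implementation (see the linked SimiaoZuo/Caching-Adv repository for that): the class and function names (`PerturbationCache`, `gen_perturbation`, `refresh_every`) are assumptions, and the "perturbation generator" stands in for the expensive inner adversarial optimization.

```python
import numpy as np

class PerturbationCache:
    """Toy ARCH-style cache: perturbations are regenerated only every
    `refresh_every` epochs; between refreshes, each sample reuses the
    averaged perturbation of its K nearest cached neighbors."""

    def __init__(self, k=2, refresh_every=3):
        self.k = k
        self.refresh_every = refresh_every
        self.keys = None           # cached sample embeddings, shape (M, d)
        self.perturbations = None  # cached perturbations,     shape (M, d)

    def refresh(self, embeddings, gen_perturbation):
        # Expensive step, done rarely: generate fresh perturbations
        # for a small subset of samples and cache them.
        self.keys = embeddings
        self.perturbations = np.stack([gen_perturbation(e) for e in embeddings])

    def lookup(self, embedding):
        # Cheap step: average the perturbations of the K nearest
        # cached embeddings (Euclidean distance).
        dists = np.linalg.norm(self.keys - embedding, axis=1)
        nearest = np.argsort(dists)[: self.k]
        return self.perturbations[nearest].mean(axis=0)

def perturbations_for_epoch(epoch, batch, cache, gen_perturbation):
    """Return one perturbation per sample; regenerate the cache only
    on refresh epochs, otherwise reuse it via K-NN lookup."""
    if epoch % cache.refresh_every == 0:
        cache.refresh(batch, gen_perturbation)
    return np.stack([cache.lookup(x) for x in batch])
```

In a real adversarial-regularized training loop, `gen_perturbation` would run a gradient-based inner maximization (e.g. a few PGD steps), which is exactly the cost the caching amortizes across epochs.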