Sharpness-Aware Minimization with Dynamic Reweighting

Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen


Abstract
Deep neural networks are often overparameterized and may not generalize easily. Adversarial training has proven effective at improving generalization by regularizing the change in loss under adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm applies adversarial perturbations to the model weights, encouraging the model to converge to a flat minimum. SAM finds a single adversarial weight perturbation shared by each batch. Per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization, but their computational cost is very high, which makes it impractical to use them efficiently in SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting (delta-SAM). Our theoretical analysis suggests that the stronger, per-instance adversarial weight perturbations can be approached using reweighted per-batch weight perturbations. delta-SAM dynamically reweights the perturbation within each batch according to theoretically principled weighting factors, serving as a good approximation to per-instance perturbation. Experiments on various natural language understanding tasks demonstrate the effectiveness of delta-SAM.
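To make the per-batch perturbation described in the abstract concrete, here is a minimal sketch of one standard SAM update in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the linear least-squares model, the function names `loss_and_grad` and `sam_step`, and the values of `lr` and `rho` are all hypothetical choices made for this example. The dynamic reweighting of delta-SAM is deliberately not shown, because the abstract does not give the weighting factors.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean squared error of a toy linear model and its gradient w.r.t. w."""
    residual = X @ w - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update using a single perturbation shared by the whole batch."""
    _, grad = loss_and_grad(w, X, y)
    # First-order approximation of the worst-case weight perturbation
    # inside an L2 ball of radius rho (assumed radius, not from the paper).
    eps = rho * grad / (np.linalg.norm(grad) + 1e-12)
    # Evaluate the gradient at the perturbed weights, then apply it
    # to the original (unperturbed) weights.
    _, grad_adv = loss_and_grad(w + eps, X, y)
    return w - lr * grad_adv, eps

# Toy data: a noiseless linear regression problem (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0])

w = np.zeros(4)
initial_loss, _ = loss_and_grad(w, X, y)
for _ in range(200):
    w, eps = sam_step(w, X, y)
final_loss, _ = loss_and_grad(w, X, y)
```

Each step costs two gradient evaluations per batch. A per-instance variant would need a separate perturbation, and hence a separate second gradient, for every example in the batch, which is the cost the abstract refers to.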
Anthology ID:
2022.findings-emnlp.417
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5686–5699
URL:
https://aclanthology.org/2022.findings-emnlp.417
DOI:
10.18653/v1/2022.findings-emnlp.417
Cite (ACL):
Wenxuan Zhou, Fangyu Liu, Huan Zhang, and Muhao Chen. 2022. Sharpness-Aware Minimization with Dynamic Reweighting. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5686–5699, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Sharpness-Aware Minimization with Dynamic Reweighting (Zhou et al., Findings 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.findings-emnlp.417.pdf