Posterior Differential Regularization with f-divergence for Improving Model Robustness
Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, Jianfeng Gao
Abstract
We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the difference between model posteriors on clean and noisy inputs. Theoretically, we connect two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. Additionally, we generalize posterior differential regularization to the family of f-divergences and characterize the overall framework in terms of the Jacobian matrix. Empirically, we compare these regularizations against standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model generalization. In both fully supervised and semi-supervised settings, we show that regularizing the posterior difference with an f-divergence can substantially improve model robustness. In particular, with a proper f-divergence, a BERT-base model can achieve generalization comparable to its BERT-large counterpart in in-domain, adversarial, and domain-shift scenarios, indicating the great potential of the proposed framework for enhancing NLP model robustness.
- Anthology ID: 2021.naacl-main.85
- Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month: June
- Year: 2021
- Address: Online
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 1078–1089
- URL: https://aclanthology.org/2021.naacl-main.85
- DOI: 10.18653/v1/2021.naacl-main.85
- Cite (ACL): Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, and Jianfeng Gao. 2021. Posterior Differential Regularization with f-divergence for Improving Model Robustness. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1078–1089, Online. Association for Computational Linguistics.
- Cite (Informal): Posterior Differential Regularization with f-divergence for Improving Model Robustness (Cheng et al., NAACL 2021)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.85.pdf
- Code: additional community code
- Data: BioASQ, IMDb Movie Reviews, MRQA, MultiNLI, SQuAD, SST
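The abstract's core idea, penalizing the discrepancy between model posteriors on clean and noisy inputs with an f-divergence, can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the generator functions shown and the `posterior_diff_loss` helper are hypothetical names chosen for exposition, and the paper's exact choice of f-divergences may differ.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def f_divergence(p, q, f):
    """Generic f-divergence D_f(p || q) = sum_y q(y) * f(p(y) / q(y)),
    where f is a convex generator with f(1) = 0."""
    return float(np.sum(q * f(p / q)))

# Two standard generators (illustrative members of the f-divergence family):
kl = lambda t: t * np.log(t)                    # recovers KL(p || q)
hellinger = lambda t: (np.sqrt(t) - 1.0) ** 2   # recovers squared Hellinger

def posterior_diff_loss(logits_clean, logits_noisy, f=kl, lam=1.0):
    """Hypothetical regularizer: weighted f-divergence between the posterior
    on a clean input and the posterior on its noisy counterpart."""
    p = softmax(logits_clean)
    q = softmax(logits_noisy)
    return lam * f_divergence(p, q, f)
```

In training, a term like this would be added to the task loss, with the noisy logits produced by, e.g., Gaussian input noise or an adversarial perturbation; with the KL generator, the regularizer reduces to a KL-based consistency term of the kind used in Virtual Adversarial Training.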