Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model
Pengwei Zhan, Jing Yang, Xiao Huang, Chunlei Jing, Jingying Li, Liming Wang
Abstract
Neural language models have achieved superior performance. However, these models also suffer from the pathology of overconfidence on out-of-distribution examples, which can make the model difficult to interpret and cause interpretation methods to fail to provide faithful attributions. In this paper, we explain the model pathology from the view of sentence representation and argue that the counter-intuitive bias degree and direction of the out-of-distribution examples’ representation cause the pathology. We propose a Contrastive learning regularization method using Adversarial examples for Alleviating the Pathology (ConAAP), which calibrates the sentence representation of out-of-distribution examples. ConAAP generates positive and negative examples following the attribution results and utilizes adversarial examples to introduce direction information in regularization. Experiments show that ConAAP effectively alleviates the model pathology while only slightly impacting the generalization ability on in-distribution examples, and thus helps interpretation methods obtain more faithful results.
- Anthology ID: 2023.acl-long.358
- Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 6493–6508
- URL: https://aclanthology.org/2023.acl-long.358
- DOI: 10.18653/v1/2023.acl-long.358
- Cite (ACL): Pengwei Zhan, Jing Yang, Xiao Huang, Chunlei Jing, Jingying Li, and Liming Wang. 2023. Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6493–6508, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model (Zhan et al., ACL 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/2023.acl-long.358.pdf
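
The abstract describes ConAAP as a contrastive regularizer over sentence representations but does not give the objective. As a rough illustration only, the sketch below shows a generic InfoNCE-style contrastive term added to a task loss. It is not the paper's formulation: the function names (`cosine`, `contrastive_loss`, `regularized_loss`), the `temperature` and `weight` parameters, and the toy vectors standing in for sentence representations are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (an assumption, not the ConAAP objective).

    The loss is small when the anchor is closer to the positive than to
    any of the negatives, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def regularized_loss(task_loss, anchor, positive, negatives, weight=0.5):
    """Task loss plus a weighted contrastive regularization term."""
    return task_loss + weight * contrastive_loss(anchor, positive, negatives)


if __name__ == "__main__":
    # Toy 2-d vectors standing in for sentence representations (hypothetical).
    anchor = [1.0, 0.0]
    near = [0.9, 0.1]
    far = [0.0, 1.0]
    print(contrastive_loss(anchor, near, [far]))  # positive is close: low loss
    print(contrastive_loss(anchor, far, [near]))  # positive is far: high loss
```

In this sketch the regularizer only pulls the anchor toward its positive and away from its negatives. How ConAAP actually selects positives and negatives from attribution results, and how adversarial examples supply direction information, is specified in the paper itself rather than here.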