Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models
Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl
Abstract
In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
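To make the joint objective concrete, below is a minimal sketch of how an NCE loss over real and noise-model sentences might be combined with the usual classification loss during finetuning. It assumes a pooled text encoder, a scalar energy head, and a noise language model whose log-probabilities are available for both real and sampled sentences; all names (encoder, energy_head, classifier_head, the batch keys, the weight lam) are illustrative placeholders rather than the authors' implementation, which lives in the linked salesforce/ebm_calibration_nlu repository.

```python
# Minimal sketch (not the paper's official code): joint finetuning with a
# task cross-entropy loss plus a binary NCE loss that trains the energy head
# to separate real sentences from samples drawn from a noise language model.
import torch
import torch.nn.functional as F


def nce_loss(energy_real, energy_noise, noise_logp_real, noise_logp_noise):
    """Binary NCE with one noise sample per real sample.

    The EBM assigns log p_theta(x) = -E(x) up to an unknown constant; NCE fits
    it by logistic regression of real vs. noise samples, where the logit is
    log p_theta(x) - log p_noise(x).
    """
    logit_real = -energy_real - noise_logp_real     # real sentences, label 1
    logit_noise = -energy_noise - noise_logp_noise  # noise samples, label 0
    return (
        F.binary_cross_entropy_with_logits(logit_real, torch.ones_like(logit_real))
        + F.binary_cross_entropy_with_logits(logit_noise, torch.zeros_like(logit_noise))
    )


def joint_step(encoder, classifier_head, energy_head, batch, noise_batch, lam=1.0):
    """One joint training step: classification loss + lam * NCE energy loss."""
    h_real = encoder(batch["input_ids"])            # [B, d] pooled encodings
    h_noise = encoder(noise_batch["input_ids"])     # [B, d] pooled encodings

    # "scalar" energy variant: a separate linear head producing one scalar per
    # sentence; the "hidden" / "sharp-hidden" variants instead derive the
    # energy from the classifier's own outputs rather than a new head.
    e_real = energy_head(h_real).squeeze(-1)        # [B]
    e_noise = energy_head(h_noise).squeeze(-1)      # [B]

    task_loss = F.cross_entropy(classifier_head(h_real), batch["labels"])
    ebm_loss = nce_loss(e_real, e_noise,
                        batch["noise_logp"], noise_batch["noise_logp"])
    return task_loss + lam * ebm_loss
```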
- Anthology ID: 2021.eacl-main.151
- Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
- Month: April
- Year: 2021
- Address: Online
- Venue: EACL
- Publisher: Association for Computational Linguistics
- Pages: 1754–1761
- URL: https://aclanthology.org/2021.eacl-main.151
- DOI: 10.18653/v1/2021.eacl-main.151
- Cite (ACL): Tianxing He, Bryan McCann, Caiming Xiong, and Ehsan Hosseini-Asl. 2021. Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1754–1761, Online. Association for Computational Linguistics.
- Cite (Informal): Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models (He et al., EACL 2021)
- PDF: https://preview.aclanthology.org/nodalida-main-page/2021.eacl-main.151.pdf
- Code: salesforce/ebm_calibration_nlu
- Data: GLUE, QNLI