The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing
Abstract
In selective prediction, a classifier is allowed to abstain from making predictions on low-confidence examples. Though this setting is interesting and important, selective prediction has rarely been examined in natural language processing (NLP) tasks. To fill this void in the literature, we study in this paper selective prediction for NLP, comparing different models and confidence estimators. We further propose a simple error regularization trick that improves confidence estimation without substantially increasing the computation budget. We show that recent pre-trained transformer models simultaneously improve both model accuracy and confidence estimation effectiveness. We also find that our proposed regularization improves confidence estimation and can be applied to other relevant scenarios, such as using classifier cascades for accuracy–efficiency trade-offs. Source code for this paper can be found at https://github.com/castorini/transformers-selective.
- Anthology ID: 2021.acl-long.84
- Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month: August
- Year: 2021
- Address: Online
- Venues: ACL | IJCNLP
- Publisher: Association for Computational Linguistics
- Pages: 1040–1051
- URL: https://aclanthology.org/2021.acl-long.84
- DOI: 10.18653/v1/2021.acl-long.84
- Cite (ACL): Ji Xin, Raphael Tang, Yaoliang Yu, and Jimmy Lin. 2021. The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1040–1051, Online. Association for Computational Linguistics.
- Cite (Informal): The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing (Xin et al., ACL-IJCNLP 2021)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.84.pdf
- Code: castorini/transformers-selective
- Data: GLUE, MRPC, MultiNLI, QNLI, SST
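
The selective prediction setting described in the abstract can be illustrated with a minimal sketch. It assumes the maximum softmax probability as the confidence estimator and a fixed abstention threshold, which is a common baseline and not necessarily the estimator or the error regularization the paper proposes. The function names (`selective_predict`, `coverage_and_risk`) and the toy logits are hypothetical and are not taken from the castorini/transformers-selective repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_predict(logits, threshold):
    """Return the argmax label, or None (abstain) when confidence is too low.

    Assumption: confidence is the maximum softmax probability.
    """
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)


def coverage_and_risk(batch_logits, labels, threshold):
    """Coverage = fraction of examples answered; risk = error rate on those."""
    preds = [selective_predict(l, threshold) for l in batch_logits]
    answered = [(p, y) for p, y in zip(preds, labels) if p is not None]
    coverage = len(answered) / len(labels)
    if not answered:
        return coverage, 0.0
    errors = sum(1 for p, y in answered if p != y)
    return coverage, errors / len(answered)


# Toy two-class logits (made up for illustration only).
batch_logits = [[4.0, 0.0], [0.2, 0.0], [0.0, 3.0], [0.1, 0.0]]
labels = [0, 1, 1, 0]

strict = coverage_and_risk(batch_logits, labels, threshold=0.9)
lenient = coverage_and_risk(batch_logits, labels, threshold=0.0)
print(strict, lenient)
```

Raising the threshold lowers coverage (more abstentions) and, when the confidence estimate is informative, lowers the error rate on the examples that are still answered. In the toy data the strict threshold answers only the two confident examples, both correctly, while the lenient one answers everything and makes one mistake.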