Uncertainty Estimation of Transformer Predictions for Misclassification Detection

Artem Vazhentsev, Gleb Kuzmin, Artem Shelmanov, Akim Tsvigun, Evgenii Tsymbalov, Kirill Fedyanin, Maxim Panov, Alexander Panchenko, Gleb Gusev, Mikhail Burtsev, Manvel Avetisian, Leonid Zhukov


Abstract
Uncertainty estimation (UE) of model predictions is a crucial step for a variety of tasks such as active learning, misclassification detection, adversarial attack detection, out-of-distribution detection, etc. Most of the works on modeling the uncertainty of deep neural networks evaluate these methods on image classification tasks. Little attention has been paid to UE in natural language processing. To fill this gap, we perform a vast empirical investigation of state-of-the-art UE methods for Transformer models on misclassification detection in named entity recognition and text classification tasks and propose two computationally efficient modifications, one of which approaches or even outperforms computationally intensive methods.
Anthology ID:
2022.acl-long.566
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
8237–8252
Language:
URL:
https://aclanthology.org/2022.acl-long.566
DOI:
10.18653/v1/2022.acl-long.566
Bibkey:
Cite (ACL):
Artem Vazhentsev, Gleb Kuzmin, Artem Shelmanov, Akim Tsvigun, Evgenii Tsymbalov, Kirill Fedyanin, Maxim Panov, Alexander Panchenko, Gleb Gusev, Mikhail Burtsev, Manvel Avetisian, and Leonid Zhukov. 2022. Uncertainty Estimation of Transformer Predictions for Misclassification Detection. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8237–8252, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Uncertainty Estimation of Transformer Predictions for Misclassification Detection (Vazhentsev et al., ACL 2022)
Copy Citation:
PDF:
https://preview.aclanthology.org/naacl24-info/2022.acl-long.566.pdf
Video:
 https://preview.aclanthology.org/naacl24-info/2022.acl-long.566.mp4
Code
 airi-institute/uncertainty_transformers
Data
CoLACoNLL 2003GLUEMRPCSSTSST-2