RECAL: Sample-Relation Guided Confidence Calibration over Tabular Data
Wang HaoTian, Zhen Zhang, Mengting Hu, Qichao Wang, Liang Chen, Yatao Bian, Bingzhe Wu
Abstract
Tabular data is widely used in real-world applications, and machine learning models for such data have achieved remarkable success in both industrial applications and data-science competitions. Despite these successes, most current machine learning methods for tabular data lack accurate confidence estimation, which is needed in high-risk, sensitive applications such as credit modeling and financial fraud detection. In this paper, we study confidence estimation for machine learning models applied to tabular data. Our key finding is that a real-world tabular dataset typically contains implicit relations between samples, and that these relations can help produce more accurate confidence estimates. To this end, we introduce RECAL, a general post-training confidence calibration framework that calibrates the predictive confidence of existing machine learning models by using graph neural networks to model the relations between samples. We perform extensive experiments on tabular datasets with both implicit and explicit graph structures and show that RECAL can significantly improve calibration quality compared with conventional methods that do not consider sample relations.
- Anthology ID: 2023.findings-emnlp.482
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Houda Bouamor, Juan Pino, Kalika Bali
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 7246–7257
- URL: https://aclanthology.org/2023.findings-emnlp.482
- DOI: 10.18653/v1/2023.findings-emnlp.482
- Cite (ACL): Wang HaoTian, Zhen Zhang, Mengting Hu, Qichao Wang, Liang Chen, Yatao Bian, and Bingzhe Wu. 2023. RECAL: Sample-Relation Guided Confidence Calibration over Tabular Data. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7246–7257, Singapore. Association for Computational Linguistics.
- Cite (Informal): RECAL: Sample-Relation Guided Confidence Calibration over Tabular Data (HaoTian et al., Findings 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2023.findings-emnlp.482.pdf