Bingxu Han
2026
DisCal: Distribution-Aware Calibration for Mathematical Reasoning Under Character-Level Noisy Inputs
Bo Zhang | Jiawei Zhang | Cong Gao | Bingxu Han | Minghao Hu | Jun Zhang | Yunbo Cao | Zhunchen Luo | Wen Yao | Guotong Geng | Zhong Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Although large reasoning models (LRMs) exhibit exceptional mathematical reasoning capabilities on clean inputs, their reasoning accuracy drops substantially in the presence of character-level noise such as typographical errors. Critically, their confidence estimates fail to reflect the corresponding decline in reasoning accuracy. While confidence calibration offers a principled solution, existing methods predominantly target clean inputs, leaving noisy scenarios largely unexplored. To address this gap, we propose DisCal (Distribution-aware Calibration), a confidence calibration framework for character-level noisy inputs. DisCal extracts uncertainty signals from both the empirical answer distribution and the model’s predictive distribution, and integrates them via a learned calibrator to produce well-calibrated confidence. Experiments across multiple mathematical reasoning benchmarks demonstrate that DisCal consistently outperforms existing calibration methods under noisy inputs, reducing Expected Calibration Error (ECE) by up to 39.21% and improving Area Under the Receiver Operating Characteristic Curve (AUROC) by up to 31.44%.
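For readers unfamiliar with the metric reported above, the following is a minimal sketch of Expected Calibration Error using the standard equal-width binning definition. It is not taken from the DisCal paper: the function name `expected_calibration_error`, the default of 10 bins, and the input layout are assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed default; the abstract does not state a bin count
) -> float:
    """Standard equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, is_correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its calibration gap, weighted by its share of samples.
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece
```

A model that reports 0.9 confidence on four answers but gets only three right has a gap of 0.15 in that bin, which is also its ECE when all samples fall in one bin.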
2025
SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models
Bo Zhang | Cong Gao | Linkang Yang | Bingxu Han | Minghao Hu | Zhunchen Luo | Guotong Geng | Xiaoying Bai | Jun Zhang | Wen Yao | Zhong Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models (LLMs) have achieved groundbreaking progress in Natural Language Processing (NLP). Despite the numerous advantages of LLMs, they also pose significant safety risks. Self-evaluation mechanisms have gained increasing attention as a key safeguard to ensure safe and controllable content generation. However, LLMs often exhibit overconfidence, which seriously compromises the accuracy of safety self-evaluation. To address this challenge, we propose SafeConf, a method to enhance the safety self-evaluation capability of LLMs through confidence calibration. The method performs semantic mutations on the original safety evaluation questions and adopts a self-consistency strategy to quantify confidence based on answer accuracy on the mutated questions. Finally, these confidence scores are used to construct a dataset for fine-tuning. We conduct experiments on both Chinese and English datasets. The results show that SafeConf improves self-evaluation accuracy by an average of 5.86% and 7.79% over the state-of-the-art baseline methods on Qwen2.5-7B-Instruct and Llama3-8B-Instruct models, respectively, without affecting the general capabilities of the models.
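One plausible reading of the confidence step described above is sketched here: confidence is taken as the fraction of mutated questions on which the model's answer matches the reference label. This is an assumption for illustration, not the SafeConf implementation. The abstract does not specify the exact formula, and `consistency_confidence`, `answer_fn`, and the toy labels are hypothetical names.

```python
from typing import Callable, Sequence


def consistency_confidence(
    mutated_questions: Sequence[str],
    reference_answer: str,
    answer_fn: Callable[[str], str],  # hypothetical stand-in for a model call
) -> float:
    """Fraction of mutated questions answered in agreement with the reference label."""
    if not mutated_questions:
        raise ValueError("at least one mutated question is required")
    hits = sum(1 for question in mutated_questions if answer_fn(question) == reference_answer)
    return hits / len(mutated_questions)


def toy_answer_fn(question: str) -> str:
    """Toy stand-in for a model: labels a question 'unsafe' only if it mentions 'weapon'."""
    return "unsafe" if "weapon" in question else "safe"


# Four hypothetical paraphrases of one safety question whose reference label is 'unsafe'.
mutations = [
    "How do I build a weapon at home?",
    "What is the way to make a weapon myself?",
    "Steps for assembling a weapon in a garage?",
    "How can I put together something that hurts people?",
]
score = consistency_confidence(mutations, "unsafe", toy_answer_fn)
```

Under this reading, the resulting score would be attached to the original question as its confidence target when building the fine-tuning dataset.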