Wataru Hashimoto


2025

Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks
Wataru Hashimoto | Hidetaka Kamigaito | Taro Watanabe
Findings of the Association for Computational Linguistics: NAACL 2025

Trustworthiness in model predictions is crucial for safety-critical applications in the real world. However, deep neural networks often suffer from poor uncertainty estimation, such as miscalibration. In this study, we propose k-Nearest Neighbor Uncertainty Estimation (kNN-UE), a new uncertainty estimation method that uses not only the distances to the neighbors but also the ratio of labels among them. Experiments on sentiment analysis, natural language inference, and named entity recognition show that our proposed method outperforms the baselines and recent density-based methods on several calibration and uncertainty metrics. Moreover, our analyses indicate that approximate nearest neighbor search techniques reduce the inference overhead without significantly degrading uncertainty estimation performance when they are appropriately combined.
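
The paper defines kNN-UE precisely; the snippet below is only a minimal sketch of the underlying idea, assuming a simple multiplicative combination of a distance term and a label-ratio term over the k nearest training examples. The function name, the exponential distance weighting, and the temperature parameter are illustrative assumptions, not the paper's formulation.

```python
# Illustrative sketch of a kNN-based confidence score (not the paper's
# exact kNN-UE formulation): combine the mean distance to the k nearest
# labeled training examples with the fraction of those neighbors that
# agree with the model's predicted label.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_uncertainty(train_embs, train_labels, query_embs, query_preds,
                    k=10, temp=1.0):
    """Return a confidence score in [0, 1] for each query example.

    train_embs:   (N, d) encoder representations of training examples
    train_labels: (N,)   gold labels of those examples
    query_embs:   (M, d) representations of test examples
    query_preds:  (M,)   model-predicted labels for the test examples
    """
    nn = NearestNeighbors(n_neighbors=k).fit(train_embs)
    dists, idxs = nn.kneighbors(query_embs)  # both (M, k)

    # Distance term: closer neighbors -> higher confidence.
    dist_term = np.exp(-dists.mean(axis=1) / temp)

    # Label-ratio term: fraction of neighbors sharing the prediction.
    neighbor_labels = train_labels[idxs]  # (M, k)
    ratio_term = (neighbor_labels == query_preds[:, None]).mean(axis=1)

    return dist_term * ratio_term
```

In practice, the exhaustive sklearn search above could be swapped for an approximate index (e.g., FAISS) to cut inference overhead; that speed-versus-quality trade-off is what the abstract's analysis of approximate nearest neighbor search examines.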

2024

Are Data Augmentation Methods in Named Entity Recognition Applicable for Uncertainty Estimation?
Wataru Hashimoto | Hidetaka Kamigaito | Taro Watanabe
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

This work investigates the impact of data augmentation on confidence calibration and uncertainty estimation in Named Entity Recognition (NER) tasks. For the future advancement of NER in safety-critical fields like healthcare and finance, it is essential to achieve accurate predictions with calibrated confidence when applying Deep Neural Networks (DNNs), including Pre-trained Language Models (PLMs), in real-world applications. However, DNNs are prone to miscalibration, which limits their applicability. Moreover, existing methods for calibration and uncertainty estimation are computationally expensive. Our investigation in NER found that data augmentation improves calibration and uncertainty estimation in cross-genre and cross-lingual settings, especially in the in-domain setting. Furthermore, we showed that calibration for NER tends to be more effective when the perplexity of the sentences generated by data augmentation is lower, and that increasing the augmentation size further improves calibration and uncertainty estimation.
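
Calibration in both abstracts is conventionally measured with expected calibration error (ECE). As a concrete reference point, here is a minimal sketch of the standard metric; the equal-width binning scheme and function name are common-usage assumptions, not details taken from the paper.

```python
# Minimal expected calibration error (ECE): bin predictions by
# confidence, then average the |accuracy - confidence| gap per bin,
# weighted by the fraction of examples falling in each bin.
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # bin weight times calibration gap
    return ece
```

A well-calibrated model drives this value toward zero: among predictions made with, say, 80% confidence, about 80% should be correct.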