Calibration of Machine Reading Systems at Scale

Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, Mrinmaya Sachan


Abstract
In typical machine learning systems, an estimate of the probability of the prediction is used to assess the system’s confidence in the prediction. This confidence measure is usually uncalibrated; i.e., the system’s confidence in the prediction does not match the true probability of the predicted output. In this paper, we present an investigation into calibrating open-setting machine reading systems, such as open-domain question answering and claim verification systems. We show that calibrating such complex systems, which contain discrete retrieval and deep reading components, is challenging and that current calibration techniques fail to scale to these settings. We propose simple extensions to existing calibration approaches that allow us to adapt them to these settings. Our experimental results reveal that the approach works well and can be useful for selectively predicting answers when question answering systems are faced with questions that are unanswerable or outside the training distribution.
Anthology ID:
2022.findings-acl.133
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1682–1693
URL:
https://aclanthology.org/2022.findings-acl.133
DOI:
10.18653/v1/2022.findings-acl.133
Cite (ACL):
Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, and Mrinmaya Sachan. 2022. Calibration of Machine Reading Systems at Scale. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1682–1693, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Calibration of Machine Reading Systems at Scale (Dhuliawala et al., Findings 2022)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-acl.133.pdf
Video:
https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-acl.133.mp4
Data
Natural Questions