Predicting generalization performance with correctness discriminators

Yuekun Yao, Alexander Koller

Abstract
The ability to predict an NLP model’s accuracy on unseen, potentially out-of-distribution data is a prerequisite for trustworthiness. We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data. We achieve this by training a *discriminator* which predicts whether the output of a given sequence-to-sequence model is correct or not. We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds, and that these bounds are remarkably close together.
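
The way per-output correctness judgments turn into an accuracy interval can be illustrated with a minimal sketch. The three-way judgment ("correct" / "incorrect" / "uncertain") and the `accuracy_bounds` helper below are illustrative assumptions, not the paper's exact procedure; the paper trains a discriminator on sequence-to-sequence outputs, while this sketch only shows how such judgments would bound an unknown accuracy without gold labels:

```python
def accuracy_bounds(judgments: list[str]) -> tuple[float, float]:
    """Bound a model's accuracy on unlabeled data, given one
    discriminator judgment per prediction: "correct", "incorrect",
    or "uncertain".

    Lower bound: every uncertain prediction counts as wrong.
    Upper bound: every uncertain prediction counts as right.
    """
    n = len(judgments)
    correct = judgments.count("correct")
    incorrect = judgments.count("incorrect")
    return correct / n, (n - incorrect) / n


# Toy usage: 7 of 10 predictions judged correct, 1 incorrect,
# 2 uncertain -> true accuracy lies in [0.70, 0.90].
lower, upper = accuracy_bounds(
    ["correct"] * 7 + ["incorrect"] + ["uncertain"] * 2
)
print(f"accuracy in [{lower:.2f}, {upper:.2f}]")
```

In this sketch, the width of the interval is exactly the fraction of predictions the discriminator leaves undecided, so tight bounds correspond to a discriminator that commits to a verdict on nearly every output.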
Anthology ID:
2024.findings-emnlp.686
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11725–11739
URL:
https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-emnlp.686/
DOI:
10.18653/v1/2024.findings-emnlp.686
Cite (ACL):
Yuekun Yao and Alexander Koller. 2024. Predicting generalization performance with correctness discriminators. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11725–11739, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Predicting generalization performance with correctness discriminators (Yao & Koller, Findings 2024)
PDF:
https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-emnlp.686.pdf