Abstract
The ability to predict an NLP model’s accuracy on unseen, potentially out-of-distribution data is a prerequisite for trustworthiness. We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data. We achieve this by training a *discriminator* which predicts whether the output of a given sequence-to-sequence model is correct or not. We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds, and that these bounds are remarkably close together.
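As a rough illustration of the bounding idea in the abstract, here is a minimal Python sketch. It assumes the bounds are derived from an ensemble of correctness discriminators that vote on each model output; the unanimous/any voting rule and the name `accuracy_bounds` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple

def accuracy_bounds(votes: List[List[bool]]) -> Tuple[float, float]:
    """Derive label-free accuracy bounds from discriminator votes.

    votes[i][k] is True iff discriminator k judges the model's output
    on example i to be correct. Illustrative assumption: the lower
    bound counts outputs accepted unanimously, the upper bound counts
    outputs accepted by at least one discriminator.
    """
    n = len(votes)
    lower = sum(all(v) for v in votes) / n   # every discriminator says "correct"
    upper = sum(any(v) for v in votes) / n   # at least one says "correct"
    return lower, upper

# Toy usage: 4 examples scored by 3 discriminators.
votes = [
    [True, True, True],     # clearly correct output
    [True, False, True],    # discriminators disagree
    [False, False, False],  # clearly incorrect output
    [True, True, False],    # discriminators disagree
]
low, high = accuracy_bounds(votes)
print(f"predicted accuracy in [{low:.2f}, {high:.2f}]")  # [0.25, 0.75]
```

If the gold accuracy reliably falls inside this interval and the interval is narrow, as the abstract reports, the discriminators provide a useful estimate of generalization performance without any gold labels.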
- Anthology ID:
- 2024.findings-emnlp.686
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2024
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11725–11739
- URL:
- https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.686/
- DOI:
- 10.18653/v1/2024.findings-emnlp.686
- Cite (ACL):
- Yuekun Yao and Alexander Koller. 2024. Predicting generalization performance with correctness discriminators. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11725–11739, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Predicting generalization performance with correctness discriminators (Yao & Koller, Findings 2024)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2024.findings-emnlp.686.pdf