Abstract
Uncertainty estimation is an important diagnostic tool for statistical models, and is often used to assess the confidence of model predictions. Previous work shows that neural machine translation (NMT) is an intrinsically uncertain task where there are often multiple correct and semantically equivalent translations, and that well-trained NMT models produce good translations despite spreading probability mass among many semantically similar translations. These findings suggest that popular measures of uncertainty based on token- and sequence-level entropies, which measure surface-form diversity, may not be good proxies for the more useful quantity of interest, semantic diversity. We propose to adapt similarity-sensitive Shannon entropy (S3E), a concept borrowed from theoretical ecology, for NMT. By demonstrating significantly improved correlation between S3E and task performance on quality estimation and named entity recall, we show that S3E is a useful framework for measuring uncertainty in NMT.
- Anthology ID:
- 2024.eacl-long.129
- Volume:
- Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2115–2128
- URL:
- https://aclanthology.org/2024.eacl-long.129
- Cite (ACL):
- Julius Cheng and Andreas Vlachos. 2024. Measuring Uncertainty in Neural Machine Translation with Similarity-Sensitive Entropy. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2115–2128, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Measuring Uncertainty in Neural Machine Translation with Similarity-Sensitive Entropy (Cheng & Vlachos, EACL 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2024.eacl-long.129.pdf