Abstract
Lexical disambiguation is a major challenge for machine translation systems, especially when some senses of a word occur less frequently in the training data than others. Identifying patterns of overgeneralization requires evaluation methods that are both reliable and scalable. We propose contrastive conditioning as a reference-free black-box method for detecting disambiguation errors. Specifically, we score the quality of a translation by conditioning on variants of the source that provide contrastive disambiguation cues. After validating our method, we apply it in a case study to perform a targeted evaluation of sequence-level knowledge distillation. By probing word sense disambiguation and translation of gendered occupation names, we show that distillation-trained models tend to overgeneralize more than other models with a comparable BLEU score. Contrastive conditioning thus highlights a side effect of distillation that is not fully captured by standard evaluation metrics. Code and data to reproduce our findings are publicly available.
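The core idea can be illustrated with a short sketch: score the same hypothesis translation under two source variants that carry opposite disambiguation cues, using any translation model as a black-box scorer. The snippet below is a minimal illustration only, not the authors' implementation (see the linked repository for that); the choice of scoring model, the hand-written cue sentences, and the example sentence pair are all assumptions made for demonstration.

```python
# Minimal sketch of contrastive conditioning. The scoring model
# (Helsinki-NLP/opus-mt-en-de via Hugging Face transformers) and the
# cue sentences below are illustrative assumptions, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).eval()

def score(source: str, translation: str) -> float:
    """Log-probability of `translation` conditioned on `source`."""
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=translation, return_tensors="pt").input_ids
    with torch.no_grad():
        # The model's loss is the mean negative log-likelihood of the
        # target tokens; rescale by length and negate to get a total
        # log-probability.
        loss = model(**inputs, labels=labels).loss
    return -loss.item() * labels.size(1)

# Translation under test, produced by the system being evaluated.
# Here "bat" has been rendered as the animal (Fledermaus).
hypothesis = "Sie sahen eine Fledermaus."

# Contrastive source variants that disambiguate "bat" in opposite ways.
animal_cue = "They saw a bat flying out of the cave."
object_cue = "They saw a bat lying next to the baseball."

# If the hypothesis is more probable given the source variant cueing the
# intended sense, it is accepted; otherwise it is flagged as a likely
# disambiguation error.
if score(animal_cue, hypothesis) >= score(object_cue, hypothesis):
    print("translation consistent with the intended sense")
else:
    print("likely disambiguation error")
```

Because the scorer only needs to assign conditional probabilities, the evaluated system itself is treated as a black box and no reference translations are required.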
- Anthology ID: 2021.emnlp-main.803
- Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2021
- Address: Online and Punta Cana, Dominican Republic
- Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 10246–10265
- URL: https://preview.aclanthology.org/add_missing_videos/2021.emnlp-main.803/
- DOI: 10.18653/v1/2021.emnlp-main.803
- Cite (ACL): Jannis Vamvas and Rico Sennrich. 2021. Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10246–10265, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias (Vamvas & Sennrich, EMNLP 2021)
- PDF: https://preview.aclanthology.org/add_missing_videos/2021.emnlp-main.803.pdf
- Code: zurichnlp/contrastive-conditioning