Abstract
While it has been shown that Neural Machine Translation (NMT) is highly sensitive to noisy parallel training samples, prior work treats all types of mismatches between source and target as noise. As a result, it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact NMT training. To close this gap, we analyze the impact of different types of fine-grained semantic divergences on Transformer models. We show that models trained on synthetic divergences output degenerated text more frequently and are less confident in their predictions. Based on these findings, we introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences, improving both translation quality and model calibration on EN-FR tasks.
- Anthology ID:
- 2021.acl-long.562
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7236–7249
- URL:
- https://aclanthology.org/2021.acl-long.562
- DOI:
- 10.18653/v1/2021.acl-long.562
- Cite (ACL):
- Eleftheria Briakou and Marine Carpuat. 2021. Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 7236–7249, Online. Association for Computational Linguistics.
- Cite (Informal):
- Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation (Briakou & Carpuat, ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2021.acl-long.562.pdf
- Code
- awslabs/sockeye
- Data
- WikiMatrix