Abstract
Inspired by the success of contrastive learning in natural language processing, we incorporate contrastive learning into the conditional masked language model that is extensively used in non-autoregressive neural machine translation (NAT). Accordingly, we propose a Contrastive Non-autoregressive Neural Machine Translation (Con-NAT) model. Con-NAT optimizes the similarity of several different representations of the same token in the same sentence. We propose two methods to obtain these representations: Contrastive Common Mask and Contrastive Dropout. Positive pairs consist of different representations of the same token, while negative pairs consist of representations of different tokens. In the feature space, the contrastive loss pulls positive pairs together and pushes negative pairs apart. We conduct extensive experiments on six translation directions with different data sizes. The results demonstrate that Con-NAT yields consistent and significant improvements in both fully non-autoregressive and iterative NAT. Con-NAT achieves state-of-the-art results on WMT'16 Ro-En (34.18 BLEU).
- Anthology ID:
- 2022.findings-emnlp.463
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2022
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6219–6231
- URL:
- https://aclanthology.org/2022.findings-emnlp.463
- DOI:
- 10.18653/v1/2022.findings-emnlp.463
- Cite (ACL):
- Hao Cheng and Zhihua Zhang. 2022. Con-NAT: Contrastive Non-autoregressive Neural Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6219–6231, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Con-NAT: Contrastive Non-autoregressive Neural Machine Translation (Cheng & Zhang, Findings 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-emnlp.463.pdf
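The contrastive objective described in the abstract — pulling together two representations of the same token and pushing apart representations of different tokens — can be sketched as an InfoNCE-style loss. This is a minimal pure-Python illustration, not the paper's implementation: the function names, temperature value, and use of cosine similarity are assumptions for exposition; Con-NAT's exact loss and pairing scheme are given in the paper itself.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors given as lists of floats.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(reps_a, reps_b, temperature=0.1):
    """InfoNCE-style loss over two views of the same token sequence.

    reps_a[i] and reps_b[i] are two representations of token i
    (e.g. produced by two passes with different dropout masks, or
    with a shared mask, as in the two methods the abstract names).
    Pairs (i, i) are positives; (i, j) with i != j are negatives.
    """
    n = len(reps_a)
    loss = 0.0
    for i in range(n):
        # Softmax over similarities to every candidate in the other view.
        sims = [math.exp(cosine(reps_a[i], reps_b[j]) / temperature)
                for j in range(n)]
        # Negative log-probability of picking the true positive.
        loss += -math.log(sims[i] / sum(sims))
    return loss / n
```

When the two views agree (each token's representations are closest to each other), the loss is near zero; when positives are misaligned with negatives, it grows, which is the pull/push behavior the abstract describes.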