Abstract
This paper proposes a simple and effective approach to address posterior collapse in conditional variational autoencoders (CVAEs). It thereby improves the performance of machine translation models, both when they use noisy or monolingual data and in conventional settings. Extending the Transformer and conditional VAEs, our proposed latent variable model measurably prevents posterior collapse by (1) using a modified evidence lower bound (ELBO) objective which promotes mutual information between the latent variable and the target, and (2) guiding the latent variable with an auxiliary bag-of-words prediction task. As a result, the proposed model yields improved translation quality compared to existing variational NMT models on WMT Ro↔En and De↔En. With latent variables being effectively utilized, our model demonstrates improved robustness over a non-latent Transformer in handling uncertainty: exploiting noisy source-side monolingual data (up to +3.2 BLEU), and training with weakly aligned web-mined parallel data (up to +4.7 BLEU).
- Anthology ID:
- 2020.acl-main.753
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 8512–8525
- URL:
- https://aclanthology.org/2020.acl-main.753
- DOI:
- 10.18653/v1/2020.acl-main.753
- Cite (ACL):
- Arya D. McCarthy, Xian Li, Jiatao Gu, and Ning Dong. 2020. Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8512–8525, Online. Association for Computational Linguistics.
- Cite (Informal):
- Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation (McCarthy et al., ACL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.acl-main.753.pdf