Latent Part-of-Speech Sequences for Neural Machine Translation
Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, Niranjan Balasubramanian
Abstract
Learning target-side syntactic structure has been shown to improve Neural Machine Translation (NMT). However, incorporating syntax through latent variables introduces additional complexity in inference, as the models need to marginalize over the latent syntactic structures. To avoid this, models often resort to greedy search, which allows them to explore only a limited portion of the latent space. In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics while allowing for effective and efficient inference over the latent space. LaSyn decouples direct dependence between successive latent variables, which allows its decoder to exhaustively search through the latent syntactic choices while keeping decoding speed proportional to the size of the latent variable vocabulary. We implement LaSyn by modifying a transformer-based NMT system and design a neural expectation maximization algorithm that we regularize with part-of-speech information as the latent sequences. Evaluations on four different MT tasks show that incorporating target-side syntax with LaSyn improves translation quality and also provides an opportunity to improve diversity.
- Anthology ID:
- D19-1072
- Volume:
- Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Venues:
- EMNLP | IJCNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 780–790
- URL:
- https://aclanthology.org/D19-1072
- DOI:
- 10.18653/v1/D19-1072
- Cite (ACL):
- Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, and Niranjan Balasubramanian. 2019. Latent Part-of-Speech Sequences for Neural Machine Translation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 780–790, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Latent Part-of-Speech Sequences for Neural Machine Translation (Yang et al., EMNLP-IJCNLP 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/D19-1072.pdf
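
The abstract's point about exhaustive search over latent syntactic choices can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`decode_step`, `marginal_word_probs`) and the toy probability tables are hypothetical, and a real system would obtain these distributions from a trained decoder. The sketch assumes that, at a single decoding step, the model exposes a distribution over part-of-speech tags and a distribution over words given each tag. Because the tag at one step does not depend directly on the tag at the previous step, every tag can be tried at each step, and the cost of the step grows linearly with the number of tags.

```python
from typing import Dict, Tuple

# Hypothetical toy distributions for ONE decoding step (illustration only).
# p(tag | context): distribution over latent part-of-speech tags.
tag_probs: Dict[str, float] = {"NOUN": 0.6, "VERB": 0.4}
# p(word | tag, context): distribution over words for each tag.
word_probs_given_tag: Dict[str, Dict[str, float]] = {
    "NOUN": {"run": 0.3, "dog": 0.7},
    "VERB": {"run": 0.9, "eat": 0.1},
}


def decode_step(
    tags: Dict[str, float],
    words_given_tag: Dict[str, Dict[str, float]],
) -> Tuple[str, str, float]:
    """Return the (tag, word, score) triple with the highest joint probability.

    Every tag is tried, so the search over the latent choice is exhaustive
    and the number of word distributions inspected equals the tag count.
    """
    best_tag, best_word, best_score = "", "", float("-inf")
    for tag, p_tag in tags.items():
        for word, p_word in words_given_tag[tag].items():
            score = p_tag * p_word  # joint p(tag, word | context)
            if score > best_score:
                best_tag, best_word, best_score = tag, word, score
    return best_tag, best_word, best_score


def marginal_word_probs(
    tags: Dict[str, float],
    words_given_tag: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Marginalize the latent tag out: p(word) = sum over tags of p(tag) * p(word | tag)."""
    marginal: Dict[str, float] = {}
    for tag, p_tag in tags.items():
        for word, p_word in words_given_tag[tag].items():
            marginal[word] = marginal.get(word, 0.0) + p_tag * p_word
    return marginal


best = decode_step(tag_probs, word_probs_given_tag)
marginal = marginal_word_probs(tag_probs, word_probs_given_tag)
```

In this toy example the joint search and the marginal disagree: the best (tag, word) pair is a noun reading, while the word with the highest marginal probability is the one supported by both tags. Which criterion a decoder should use is a design choice that this sketch does not settle.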