Abstract
This work investigates an alternative model for neural machine translation (NMT) and proposes a novel architecture in which we employ a multi-dimensional long short-term memory (MDLSTM) for translation modelling. In state-of-the-art methods, source and target sentences are treated as one-dimensional sequences over time, while we view translation as a two-dimensional (2D) mapping and use an MDLSTM layer to define the correspondence between source and target words. We extend beyond the current sequence-to-sequence backbone of NMT models to a 2D structure in which the source and target sentences are aligned with each other in a 2D grid. Our proposed topology shows consistent improvements over an attention-based sequence-to-sequence model on two WMT 2017 tasks, German↔English.
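To make the 2D recurrence concrete, the sketch below shows one common way such a grid can be realized: a two-dimensional LSTM cell whose state at grid point (i, j) depends on its neighbors (i−1, j) and (i, j−1), with the input at (i, j) built from the i-th source and j-th target embeddings. This is a hypothetical PyTorch reconstruction based only on the abstract, not the authors' released implementation; all names (`MDLSTMCell`, `run_grid`, `hidden_dim`) are illustrative, and the two-forget-gate fusion of the predecessor cell states is one standard MDLSTM variant, not necessarily the exact one used in the paper.

```python
import torch
import torch.nn as nn

class MDLSTMCell(nn.Module):
    """2D LSTM cell (hypothetical sketch): the state at grid point (i, j)
    is computed from the input there plus the hidden/cell states of the
    two predecessor points (i-1, j) and (i, j-1)."""
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        # One projection produces all five gates:
        # input, forget (source axis), forget (target axis), output, candidate.
        self.linear = nn.Linear(input_dim + 2 * hidden_dim, 5 * hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, x, h_src, c_src, h_tgt, c_tgt):
        z = self.linear(torch.cat([x, h_src, h_tgt], dim=-1))
        i, f_s, f_t, o, g = z.chunk(5, dim=-1)
        i, f_s, f_t, o = map(torch.sigmoid, (i, f_s, f_t, o))
        g = torch.tanh(g)
        # Fuse both predecessor cell states, one forget gate per dimension.
        c = f_s * c_src + f_t * c_tgt + i * g
        h = o * torch.tanh(c)
        return h, c

def run_grid(cell, src_emb, tgt_emb):
    """Sweep the source x target grid; the input at (i, j) is the
    concatenation of source embedding i and target embedding j."""
    I, J = src_emb.size(0), tgt_emb.size(0)
    H = cell.hidden_dim
    # Zero states on a padded border stand in for missing neighbors.
    h = [[torch.zeros(H) for _ in range(J + 1)] for _ in range(I + 1)]
    c = [[torch.zeros(H) for _ in range(J + 1)] for _ in range(I + 1)]
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            x = torch.cat([src_emb[i - 1], tgt_emb[j - 1]], dim=-1)
            h[i][j], c[i][j] = cell(x, h[i - 1][j], c[i - 1][j],
                                    h[i][j - 1], c[i][j - 1])
    # Hidden state for every (source position, target position) pair.
    return torch.stack([torch.stack(row[1:]) for row in h[1:]])

# Usage: a 5-word source and 7-word target with 8-dim embeddings.
cell = MDLSTMCell(input_dim=2 * 8, hidden_dim=16)
states = run_grid(cell, torch.randn(5, 8), torch.randn(7, 8))
print(states.shape)  # torch.Size([5, 7, 16])
```

In such a setup, the states along the final source row (one per target position) could feed the output softmax, so each target word would be predicted from a state that has already traversed the entire source sentence and the target prefix.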
- Anthology ID: D18-1335
- Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month: October–November
- Year: 2018
- Address: Brussels, Belgium
- Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue: EMNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 3009–3015
- URL: https://aclanthology.org/D18-1335
- DOI: 10.18653/v1/D18-1335
- Cite (ACL): Parnia Bahar, Christopher Brix, and Hermann Ney. 2018. Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3009–3015, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation (Bahar et al., EMNLP 2018)
- PDF: https://preview.aclanthology.org/ingest-2024-clasp/D18-1335.pdf