Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation

Parnia Bahar, Christopher Brix, Hermann Ney


Abstract
This work investigates an alternative model for neural machine translation (NMT) and proposes a novel architecture, where we employ a multi-dimensional long short-term memory (MDLSTM) for translation modelling. In state-of-the-art methods, source and target sentences are treated as one-dimensional sequences over time, while we view translation as a two-dimensional (2D) mapping, using an MDLSTM layer to define the correspondence between source and target words. We extend the current sequence-to-sequence backbone NMT models to a 2D structure in which the source and target sentences are aligned with each other in a 2D grid. Our proposed topology shows consistent improvements over an attention-based sequence-to-sequence model on two WMT 2017 tasks, German↔English.
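To make the 2D grid idea concrete, here is a minimal sketch of a simplified 2D-LSTM scan over the source×target grid, where the state at point (i, j) depends on its left neighbour (i-1, j) and lower neighbour (i, j-1). This is an illustrative assumption of how such a model could look, not the authors' implementation; all names (MDLSTMCell, run_grid) and details such as using two forget gates, one per neighbour, are hypothetical.

```python
import torch
import torch.nn as nn


class MDLSTMCell(nn.Module):
    """Simplified 2D-LSTM cell (illustrative, not the paper's code).

    The state at grid point (i, j) is computed from the input at that
    point plus the hidden and cell states of the left and lower
    neighbours, with one forget gate per neighbour.
    """

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        # One linear map produces all five gates (input, two forgets,
        # output, cell candidate) from [x; h_left; h_down].
        self.gates = nn.Linear(input_dim + 2 * hidden_dim, 5 * hidden_dim)

    def forward(self, x, h_left, c_left, h_down, c_down):
        z = self.gates(torch.cat([x, h_left, h_down], dim=-1))
        i, f1, f2, o, g = z.chunk(5, dim=-1)
        i, f1, f2, o = map(torch.sigmoid, (i, f1, f2, o))
        g = torch.tanh(g)
        # Each forget gate scales the cell state of one neighbour.
        c = f1 * c_left + f2 * c_down + i * g
        h = o * torch.tanh(c)
        return h, c


def run_grid(cell, src_emb, tgt_emb, hidden_dim):
    """Scan the (source x target) grid left-to-right, bottom-to-top.

    src_emb: (I, E) source word embeddings; tgt_emb: (J, E) target
    word embeddings. Returns one hidden state per target step (the
    state at the last source position), which an output softmax could
    use to predict the next target word.
    """
    I, J = src_emb.size(0), tgt_emb.size(0)
    h = [[None] * J for _ in range(I)]
    c = [[None] * J for _ in range(I)]
    zero = torch.zeros(hidden_dim)
    for j in range(J):            # target axis
        for i in range(I):        # source axis
            # Input at (i, j): concatenation of source and target embeddings.
            x = torch.cat([src_emb[i], tgt_emb[j]], dim=-1)
            h_left = h[i - 1][j] if i > 0 else zero
            c_left = c[i - 1][j] if i > 0 else zero
            h_down = h[i][j - 1] if j > 0 else zero
            c_down = c[i][j - 1] if j > 0 else zero
            h[i][j], c[i][j] = cell(x, h_left, c_left, h_down, c_down)
    return torch.stack([h[I - 1][j] for j in range(J)])  # (J, hidden_dim)


# Usage sketch: a toy 5-word source, 4-word target, embedding size 8.
cell = MDLSTMCell(input_dim=16, hidden_dim=32)
states = run_grid(cell, torch.randn(5, 8), torch.randn(4, 8), hidden_dim=32)
print(states.shape)  # torch.Size([4, 32])
```

The grid makes the source-target correspondence explicit: unlike attention, which computes a weighted sum over encoder states, every (source word, target word) pair here gets its own recurrent state.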
Anthology ID:
D18-1335
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3009–3015
URL:
https://aclanthology.org/D18-1335
DOI:
10.18653/v1/D18-1335
Cite (ACL):
Parnia Bahar, Christopher Brix, and Hermann Ney. 2018. Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3009–3015, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Towards Two-Dimensional Sequence to Sequence Model in Neural Machine Translation (Bahar et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/D18-1335.pdf