State Spaces Aren’t Enough: Machine Translation Needs Attention
Ali Vardasbi, Telmo Pessoa Pires, Robin Schmidt, Stephan Peitz
Abstract
Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g. vision, language modelling, and audio. Thanks to its mathematical formulation, it compresses its input to a single hidden state, and is able to capture long range dependencies while avoiding the need for an attention mechanism. In this work, we apply S4 to Machine Translation (MT), and evaluate several encoder-decoder variants on WMT’14 and WMT’16. In contrast with the success in language modeling, we find that S4 lags behind the Transformer by approximately 4 BLEU points, and that it counter-intuitively struggles with long sentences. Finally, we show that this gap is caused by S4’s inability to summarize the full source sentence in a single hidden state, and show that we can close the gap by introducing an attention mechanism.
- Anthology ID: 2023.eamt-1.20
- Volume: Proceedings of the 24th Annual Conference of the European Association for Machine Translation
- Month: June
- Year: 2023
- Address: Tampere, Finland
- Editors: Mary Nurminen, Judith Brenner, Maarit Koponen, Sirkku Latomaa, Mikhail Mikhailov, Frederike Schierl, Tharindu Ranasinghe, Eva Vanmassenhove, Sergi Alvarez Vidal, Nora Aranberri, Mara Nunziatini, Carla Parra Escartín, Mikel Forcada, Maja Popovic, Carolina Scarton, Helena Moniz
- Venue: EAMT
- Publisher: European Association for Machine Translation
- Pages: 205–216
- URL: https://aclanthology.org/2023.eamt-1.20
- Cite (ACL): Ali Vardasbi, Telmo Pessoa Pires, Robin Schmidt, and Stephan Peitz. 2023. State Spaces Aren’t Enough: Machine Translation Needs Attention. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pages 205–216, Tampere, Finland. European Association for Machine Translation.
- Cite (Informal): State Spaces Aren’t Enough: Machine Translation Needs Attention (Vardasbi et al., EAMT 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-1/2023.eamt-1.20.pdf