Modeling Context With Linear Attention for Scalable Document-Level Translation

Zhaofeng Wu, Hao Peng, Nikolaos Pappas, Noah A. Smith


Abstract
Document-level machine translation leverages inter-sentence dependencies to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents as their attention layers have quadratic complexity in the sequence length. Recent efforts on efficient attention improve scalability, but their effect on document translation remains unexplored. In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. We evaluate the model on IWSLT 2015 and OpenSubtitles 2018 against the transformer, demonstrating substantially increased decoding speed on long sequences with similar or better BLEU scores. We show that sentential gating further improves translation quality on IWSLT.
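Below is a minimal, hypothetical sketch of what causal linear attention with a per-step "sentential" gate can look like, to illustrate the general mechanism the abstract describes (kernelized attention computed with a running state, plus a gate that decays older context at sentence boundaries). The feature map (elu + 1), the scalar gate, the function name, and all tensor shapes are illustrative assumptions for this sketch, not the exact formulation of Peng et al. (2021) or of this paper.

```python
# Hypothetical sketch: causal linear attention with a recency ("sentential") gate.
# Assumptions: elu+1 feature map, scalar gate per time step, single attention head.
import torch

def linear_attention_with_gate(q, k, v, gate):
    """
    q, k:  (seq_len, d_k) queries and keys
    v:     (seq_len, d_v) values
    gate:  (seq_len,) values in [0, 1]; e.g., < 1 at sentence boundaries
           so that older context is down-weighted
    Returns a (seq_len, d_v) output, computed in O(seq_len) rather than
    the O(seq_len^2) of softmax attention.
    """
    phi = lambda x: torch.nn.functional.elu(x) + 1  # positive feature map
    q, k = phi(q), phi(k)

    d_k, d_v = q.shape[-1], v.shape[-1]
    state = torch.zeros(d_k, d_v)  # running sum of phi(k_t) v_t^T
    norm = torch.zeros(d_k)        # running sum of phi(k_t)
    outputs = []
    for t in range(q.shape[0]):
        # Gate the recurrent state: a gate < 1 decays earlier context.
        state = gate[t] * state + torch.outer(k[t], v[t])
        norm = gate[t] * norm + k[t]
        out = (q[t] @ state) / (q[t] @ norm + 1e-6)
        outputs.append(out)
    return torch.stack(outputs)

# Toy usage: a 6-token "document" where token 3 opens a new sentence.
q, k, v = torch.randn(6, 8), torch.randn(6, 8), torch.randn(6, 16)
gate = torch.ones(6)
gate[3] = 0.5  # decay accumulated context at the sentence boundary
print(linear_attention_with_gate(q, k, v, gate).shape)  # torch.Size([6, 16])
```

Because the state update is a simple recurrence, decoding cost per token is constant in the sequence length, which is the source of the speedups on long documents that the abstract reports.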
Anthology ID:
2022.findings-emnlp.515
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6931–6939
URL:
https://aclanthology.org/2022.findings-emnlp.515
DOI:
10.18653/v1/2022.findings-emnlp.515
Cite (ACL):
Zhaofeng Wu, Hao Peng, Nikolaos Pappas, and Noah A. Smith. 2022. Modeling Context With Linear Attention for Scalable Document-Level Translation. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6931–6939, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Modeling Context With Linear Attention for Scalable Document-Level Translation (Wu et al., Findings 2022)
PDF:
https://preview.aclanthology.org/emnlp22-frontmatter/2022.findings-emnlp.515.pdf