@inproceedings{zeng-etal-2021-recurrent,
    title = "Recurrent Attention for Neural Machine Translation",
    author = "Zeng, Jiali  and
      Wu, Shuangzhi  and
      Yin, Yongjing  and
      Jiang, Yufan  and
      Li, Mu",
    editor = "Moens, Marie-Francine  and
      Huang, Xuanjing  and
      Specia, Lucia  and
      Yih, Scott Wen-tau",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.258/",
    doi = "10.18653/v1/2021.emnlp-main.258",
    pages = "3216--3225",
    abstract = "Recent research questions the importance of the dot-product self-attention in Transformer models and shows that most attention heads learn simple positional patterns. In this paper, we push further in this research line and propose a novel substitute mechanism for self-attention: Recurrent AtteNtion (RAN). RAN directly learns attention weights without any token-to-token interaction and further improves their capacity by layer-to-layer interaction. Across an extensive set of experiments on 10 machine translation tasks, we find that RAN models are competitive and outperform their Transformer counterpart in certain scenarios, with fewer parameters and inference time. Particularly, when apply RAN to the decoder of Transformer, there brings consistent improvements by about +0.5 BLEU on 6 translation tasks and +1.0 BLEU on Turkish-English translation task. In addition, we conduct extensive analysis on the attention weights of RAN to confirm their reasonableness. Our RAN is a promising alternative to build more effective and efficient NMT models."
}

Markdown (Informal)
[Recurrent Attention for Neural Machine Translation](https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.258/) (Zeng et al., EMNLP 2021)
ACL
- Jiali Zeng, Shuangzhi Wu, Yongjing Yin, Yufan Jiang, and Mu Li. 2021. Recurrent Attention for Neural Machine Translation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3216–3225, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.