Abstract
Contextualized representations from pre-trained language models (PLMs) have been shown to be helpful for enhancing various natural language processing tasks, including neural machine translation (NMT). However, existing methods either consider encoder-only enhancement or rely on specific multilingual PLMs, which either leads to a much larger model or gives up potentially helpful knowledge from target-side PLMs. In this paper, we propose a new monolingual-PLM-sponsored NMT model that lets both the encoder and the decoder benefit from PLM enhancement, alleviating these drawbacks. In particular, by incorporating a newly proposed frequency-weighted embedding transformation algorithm, PLM embeddings can be effectively exploited in the representations of the NMT decoder. We evaluate our model on the IWSLT14 En-De, IWSLT14 De-En, WMT14 En-De, and WMT14 En-Fr tasks, and the results show that the proposed PLM enhancement yields significant improvements and even helps achieve a new state of the art.
- Anthology ID: 2023.findings-acl.222
- Volume: Findings of the Association for Computational Linguistics: ACL 2023
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 3602–3613
- URL: https://aclanthology.org/2023.findings-acl.222
- DOI: 10.18653/v1/2023.findings-acl.222
- Cite (ACL): Sufeng Duan and Hai Zhao. 2023. Encoder and Decoder, Not One Less for Pre-trained Language Model Sponsored NMT. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3602–3613, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): Encoder and Decoder, Not One Less for Pre-trained Language Model Sponsored NMT (Duan & Zhao, Findings 2023)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2023.findings-acl.222.pdf
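
To make the abstract's notion of a "frequency-weighted embedding transformation" concrete, the sketch below is purely illustrative and is not the algorithm from the paper. The `FrequencyWeightedEmbedding` class, the log-frequency blending weight, and the linear projection are all assumptions chosen for illustration: they show one plausible reading, in which PLM token embeddings are projected into the NMT decoder's embedding space and mixed with the decoder's own embeddings according to corpus token frequency.

```python
# Illustrative sketch only -- NOT the method of Duan & Zhao (2023).
# One generic way to realize a frequency-weighted embedding transformation:
# project frozen PLM embeddings into the decoder embedding space and blend
# them with the NMT decoder embeddings, with frequent tokens leaning on the
# NMT embeddings and rare tokens leaning on the PLM embeddings.

import torch
import torch.nn as nn


class FrequencyWeightedEmbedding(nn.Module):  # hypothetical class name
    def __init__(self, plm_embed: torch.Tensor, nmt_embed: torch.Tensor,
                 token_freq: torch.Tensor):
        """
        plm_embed:  (vocab, d_plm)  frozen PLM input embeddings
        nmt_embed:  (vocab, d_nmt)  trainable NMT decoder embeddings
        token_freq: (vocab,)        raw corpus counts per token
        """
        super().__init__()
        self.register_buffer("plm_embed", plm_embed)
        self.nmt_embed = nn.Parameter(nmt_embed.clone())
        # Map frequency to a blending weight in [0, 1] via normalized log counts
        # (an assumed weighting scheme, not taken from the paper).
        freq = token_freq.float().clamp(min=1.0)
        weight = freq.log() / freq.log().max().clamp(min=1e-6)
        self.register_buffer("alpha", weight.unsqueeze(-1))  # (vocab, 1)
        # Linear map from the PLM space into the decoder embedding space.
        self.proj = nn.Linear(plm_embed.size(1), nmt_embed.size(1), bias=False)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        plm = self.proj(self.plm_embed[token_ids])  # (..., d_nmt)
        nmt = self.nmt_embed[token_ids]             # (..., d_nmt)
        a = self.alpha[token_ids]                   # (..., 1)
        # Frequency-weighted interpolation of the two embedding sources.
        return a * nmt + (1.0 - a) * plm
```

As a usage example, one would construct the module with the PLM's input embedding matrix, the decoder's embedding matrix, and per-token training-corpus counts, then use its output in place of the decoder's standard embedding lookup.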