Scheduled Multi-task Learning for Neural Chat Translation
Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou
Abstract
Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, they are still far from satisfactory due to insufficient chat translation data and simple joint training schemes. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks across the training stages to effectively enhance the main chat translation task. Extensive experiments on four language directions (English-Chinese and English-German) verify the effectiveness and superiority of the proposed approach. Additionally, we will make the large-scale in-domain paired bilingual dialogue dataset publicly available for the research community.
- Anthology ID:
- 2022.acl-long.300
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4375–4388
- URL:
- https://aclanthology.org/2022.acl-long.300
- DOI:
- 10.18653/v1/2022.acl-long.300
- Cite (ACL):
- Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, and Jie Zhou. 2022. Scheduled Multi-task Learning for Neural Chat Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4375–4388, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Scheduled Multi-task Learning for Neural Chat Translation (Liang et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.acl-long.300.pdf
- Code:
- xl2248/sml
- Data:
- BMELD
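The three-stage schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Stage` class, the task names `aux_a` and `aux_b`, the weight `0.5`, the description of the first stage's data, and the choice of which stages carry auxiliary tasks are all hypothetical placeholders. The abstract states only that a second pre-training stage on large-scale in-domain data sits between the original pre-training and fine-tuning, and that auxiliary tasks are scheduled across stages.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One training stage (hypothetical structure, for illustration only)."""
    name: str
    data: str
    aux_tasks: list = field(default_factory=list)  # placeholder task names
    aux_weight: float = 0.0                        # placeholder weight


# Assumed schedule: which stages use auxiliary tasks is a placeholder choice.
SCHEDULE = [
    Stage("pretrain", "original pre-training data (assumed)"),
    Stage("in_domain_pretrain", "large-scale in-domain chat translation data",
          ["aux_a", "aux_b"], 0.5),
    Stage("finetune", "small-scale chat translation data",
          ["aux_a", "aux_b"], 0.5),
]


def total_loss(stage, main_loss, aux_losses):
    """Combine the main translation loss with the stage's active auxiliary losses."""
    active = [aux_losses[task] for task in stage.aux_tasks]
    return main_loss + stage.aux_weight * sum(active)


if __name__ == "__main__":
    losses = {"aux_a": 1.0, "aux_b": 3.0}
    for stage in SCHEDULE:
        print(stage.name, total_loss(stage, 2.0, losses))
```

Keeping the schedule as data makes it easy to vary where auxiliary tasks are switched on, which is the question the abstract says the paper investigates.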