Scheduled Multi-task Learning for Neural Chat Translation

Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, Jie Zhou


Abstract
Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, their performance is still far from satisfactory due to insufficient chat translation data and simple joint-training strategies. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Furthermore, we investigate where and how to schedule the dialogue-related auxiliary tasks across the training stages to effectively enhance the main chat translation task. Extensive experiments on four language directions (English-Chinese and English-German) verify the effectiveness and superiority of the proposed approach. Additionally, we will make the large-scale in-domain paired bilingual dialogue dataset publicly available for the research community.
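The abstract describes the training schedule only in prose; the sketch below lays out the three stages in Python as a rough illustration. Every name in it (NCTModel, train_step, the task labels, and the probabilistic decay of the auxiliary-task ratio) is a hypothetical assumption rather than the authors' implementation or exact scheduling policy; the released code at xl2248/sml is the authoritative reference.

```python
import random

class NCTModel:
    """Stand-in for an encoder-decoder chat translation model.
    Only the interface matters for this sketch."""
    def train_step(self, batch, task):
        pass  # placeholder: compute the task loss, backprop, optimizer step

def scheduled_multitask_training(model, general_pairs, in_domain_pairs, chat_batches,
                                 aux_tasks=("monolingual_coherence",
                                            "cross_lingual_coherence"),
                                 aux_ratio=0.5, seed=0):
    rng = random.Random(seed)

    # Stage 1: conventional pre-training on large-scale general-domain
    # parallel sentence pairs, translation objective only.
    for batch in general_pairs:
        model.train_step(batch, task="translation")

    # Stage 2: second pre-training on large-scale in-domain bilingual
    # dialogue pairs; dialogue-related auxiliary tasks are scheduled
    # alongside the translation objective.
    for batch in in_domain_pairs:
        model.train_step(batch, task="translation")
        if rng.random() < aux_ratio:
            model.train_step(batch, task=rng.choice(aux_tasks))

    # Stage 3: fine-tuning on the small annotated chat translation set;
    # here the auxiliary-task ratio decays (an illustrative choice) so
    # the final steps focus on the main translation task.
    total = len(chat_batches)
    for step, batch in enumerate(chat_batches):
        model.train_step(batch, task="translation")
        if rng.random() < aux_ratio * (1.0 - step / total):
            model.train_step(batch, task=rng.choice(aux_tasks))

# Usage with the stub model and toy data:
scheduled_multitask_training(NCTModel(), general_pairs=[{}] * 4,
                             in_domain_pairs=[{}] * 4, chat_batches=[{}] * 4)
```

The stochastic gating and linear decay are stand-ins for whatever scheduling strategy the paper investigates; the structural point is simply that auxiliary tasks enter during the second pre-training stage and are phased out during fine-tuning.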
Anthology ID:
2022.acl-long.300
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4375–4388
URL:
https://aclanthology.org/2022.acl-long.300
DOI:
10.18653/v1/2022.acl-long.300
Cite (ACL):
Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen, and Jie Zhou. 2022. Scheduled Multi-task Learning for Neural Chat Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4375–4388, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Scheduled Multi-task Learning for Neural Chat Translation (Liang et al., ACL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.acl-long.300.pdf
Code
 xl2248/sml
Data
BMELD