Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine Translation

Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, Jie Zhou


Abstract
Incrementally expanding the capability of an existing translation model to handle new domains over time is a fundamental and practical problem, and such expansion usually suffers from catastrophic forgetting. Multi-domain learning is generally seen as a good solution, but it has two drawbacks: 1) it requires the training data for all domains to be available at the same time, which may be unrealistic due to storage or privacy concerns; 2) it requires re-training the model from scratch on the data of all domains whenever a new domain is added, which is time-consuming and computationally expensive. To address these issues, we present a semi-supervised contrastive distillation framework for incremental neural machine translation. Specifically, to avoid catastrophic forgetting, we propose to exploit unlabeled data drawn from the same distributions as the older domains through knowledge distillation. Further, to preserve distinct domain characteristics in the model as the number of domains increases, we devise a cross-domain contrastive objective to enhance the distilled knowledge. Extensive experiments on domain translation benchmarks show that our approach, without accessing any previous training data or re-training on all domains from scratch, significantly reduces forgetting of previously learned knowledge while obtaining good performance on the incrementally added domains. The code and data with step-by-step instructions will be released upon acceptance.
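
The abstract names two ingredients, knowledge distillation on unlabeled old-domain data and a cross-domain contrastive objective, without giving their exact form. The sketch below is only an illustration of how such terms are commonly written, not the authors' formulation: the distillation term is assumed to be a KL divergence between teacher and student token distributions, the contrastive term is assumed to be an InfoNCE-style loss over cosine similarities (same-domain representation as the positive, other-domain representations as negatives), and the function names and the weights `alpha` and `beta` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token position (assumed form)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style loss (assumed form): pull the anchor toward a
    same-domain representation and away from other-domain ones."""
    pos = cosine(anchor, positive) / tau
    scores = [pos] + [cosine(anchor, n) / tau for n in negatives]
    peak = max(scores)
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_denominator - pos


def combined_loss(translation_nll, kd, ctr, alpha=1.0, beta=1.0):
    """Weighted sum of the three terms; alpha and beta are hypothetical."""
    return translation_nll + alpha * kd + beta * ctr
```

In this sketch, the translation loss would be computed on labeled data from the newly added domain, while the distillation and contrastive terms would be computed on unlabeled text resembling the older domains, with the previous model acting as the teacher.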
Anthology ID:
2024.acl-long.588
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
10914–10928
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.acl-long.588/
DOI:
10.18653/v1/2024.acl-long.588
Cite (ACL):
Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, and Jie Zhou. 2024. Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine Translation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10914–10928, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine Translation (Liang et al., ACL 2024)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.acl-long.588.pdf