MoNMT: Modularly Leveraging Monolingual and Bilingual Knowledge for Neural Machine Translation
Jianhui Pang, Baosong Yang, Derek F. Wong, Dayiheng Liu, Xiangpeng Wei, Jun Xie, Lidia S. Chao
Abstract
The effective use of monolingual and bilingual knowledge represents a critical challenge within the neural machine translation (NMT) community. In this paper, we propose a modular strategy that facilitates the cooperation of these two types of knowledge in translation tasks, while avoiding the issue of catastrophic forgetting and exhibiting superior model generalization and robustness. Our model is comprised of three functionally independent modules: an encoding module, a decoding module, and a transferring module. The former two acquire large-scale monolingual knowledge via self-supervised learning, while the latter is trained on parallel data and responsible for transferring latent features between the encoding and decoding modules. Extensive experiments in multi-domain translation tasks indicate our model yields remarkable performance, with up to 7 BLEU improvements in out-of-domain tests over the conventional pretrain-and-finetune approach. Our codes are available at https://github.com/NLP2CT/MoNMT.- Anthology ID:
- 2024.lrec-main.1010
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- SIG:
- Publisher:
- ELRA and ICCL
- Note:
- Pages:
- 11560–11573
- Language:
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2024.lrec-main.1010/
- DOI:
- Cite (ACL):
- Jianhui Pang, Baosong Yang, Derek F. Wong, Dayiheng Liu, Xiangpeng Wei, Jun Xie, and Lidia S. Chao. 2024. MoNMT: Modularly Leveraging Monolingual and Bilingual Knowledge for Neural Machine Translation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11560–11573, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- MoNMT: Modularly Leveraging Monolingual and Bilingual Knowledge for Neural Machine Translation (Pang et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2024.lrec-main.1010.pdf