Gradient Consistency-based Parameter Allocation for Multilingual Neural Machine Translation
Wenshuai Huo, Xiaocheng Feng, Yichong Huang, Chengpeng Fu, Hui Wang, Bing Qin
Abstract
Multilingual neural machine translation handles the translation of multiple languages with one unified model. However, this joint-training paradigm incurs the notorious issue of parameter interference, where the model compromises with the language diversity to find a common solution. Recent research has explored avoiding this problem by selecting certain parameters for each language direction from the original model to form language-specific sub-networks. However, determining how many parameters to choose and which parameters to select is still a serious challenge. In this work, we propose an approach called CaPA (Consistency-based Parameter Allocation), which dynamically allocates parameters of appropriate scale to each language direction based on the consistency between the gradient of the individual language and the average gradient. Specifically, CaPA allocates more parameters to languages with higher gradient consistency as these languages tend to have a more positive impact on other languages. Furthermore, considering the varying levels of interference across different parts of the model, we propose an adaptive parameter allocation based on module-level gradient consistency. Experimental results show the correlation between gradient consistency and parameter interference, as well as the effectiveness of our proposed method.
- Anthology ID:
- 2024.lrec-main.696
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 7901–7912
- URL:
- https://aclanthology.org/2024.lrec-main.696
- Cite (ACL):
- Wenshuai Huo, Xiaocheng Feng, Yichong Huang, Chengpeng Fu, Hui Wang, and Bing Qin. 2024. Gradient Consistency-based Parameter Allocation for Multilingual Neural Machine Translation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7901–7912, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Gradient Consistency-based Parameter Allocation for Multilingual Neural Machine Translation (Huo et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/landing_page/2024.lrec-main.696.pdf
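
The allocation idea summarized in the abstract (more parameters for language directions whose gradient agrees more with the average gradient) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how consistency is measured or how it is turned into a parameter budget, so the sketch assumes cosine similarity as the consistency score and a linear rescaling into a keep ratio. The function names, the toy gradients, and the `min_ratio`/`max_ratio` bounds are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gradient_consistency(grads):
    """Score each language direction by the cosine between its gradient
    and the average gradient over all directions (assumed measure)."""
    n = len(grads)
    dim = len(next(iter(grads.values())))
    avg = [sum(g[i] for g in grads.values()) / n for i in range(dim)]
    return {lang: cosine(g, avg) for lang, g in grads.items()}


def allocate_ratios(consistency, min_ratio=0.3, max_ratio=0.9):
    """Linearly rescale consistency scores into keep ratios in
    [min_ratio, max_ratio] (assumed mapping; bounds are hypothetical)."""
    lo, hi = min(consistency.values()), max(consistency.values())
    if hi == lo:
        # All directions equally consistent: give everyone the midpoint.
        mid = (min_ratio + max_ratio) / 2
        return {lang: mid for lang in consistency}
    span = max_ratio - min_ratio
    return {
        lang: min_ratio + span * (c - lo) / (hi - lo)
        for lang, c in consistency.items()
    }


# Toy flattened gradients for three language directions (made-up values).
grads = {
    "en-de": [1.0, 1.0, 0.0],
    "en-fr": [1.0, 0.8, 0.1],
    "en-zh": [-0.5, 1.0, 1.0],
}
scores = gradient_consistency(grads)
ratios = allocate_ratios(scores)
```

In this toy input, `en-zh` points furthest from the average gradient and so receives the smallest keep ratio, while `en-de` receives the largest. A module-level variant in the spirit of the abstract could call the same two functions once per module, using only that module's slice of each gradient; that extension is likewise an assumption.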