Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning

Alexis Allemann, Àlex R. Atrio, Andrei Popescu-Belis


Abstract
Multilingual NMT is a viable solution for translating low-resource languages (LRLs) when data from high-resource languages (HRLs) from the same language family is available. However, the training schedule, i.e., the order of presentation of languages, has an impact on the quality of such systems. Here, in a many-to-one translation setting, we propose to apply two algorithms that use reinforcement learning to optimize the training schedule of NMT: (1) Teacher-Student Curriculum Learning and (2) Deep Q Network. The former uses an exponentially smoothed estimate of the returns of each action based on the loss on monolingual or multilingual development subsets, while the latter estimates rewards using an additional neural network trained from the history of actions selected in different states of the system, together with the rewards received. On an 8-to-1 translation dataset with LRLs and HRLs, our second method improves BLEU and COMET scores with respect to both random selection of monolingual batches and shuffled multilingual batches, by adjusting the number of presentations of LRL vs. HRL batches.
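
The exponentially smoothed estimate of per-action returns mentioned for the first method can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the epsilon-greedy selection rule, the smoothing factor `alpha`, and the idea of passing in a reward such as the decrease in development loss after training on one batch are all assumptions made for illustration.

```python
import random


class SmoothedScheduler:
    """Hypothetical scheduler: picks which language's batch to train on next.

    Keeps one exponentially smoothed reward estimate per language (action).
    """

    def __init__(self, languages, alpha=0.1, epsilon=0.1, seed=0):
        self.q = {lang: 0.0 for lang in languages}  # smoothed return per action
        self.alpha = alpha        # smoothing factor (assumed value)
        self.epsilon = epsilon    # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, lang, reward):
        # Exponential smoothing: new estimate mixes the latest reward
        # with the previous estimate.
        self.q[lang] = self.alpha * reward + (1 - self.alpha) * self.q[lang]


# Usage with made-up rewards (e.g. drop in dev loss after one batch).
scheduler = SmoothedScheduler(["lrl", "hrl"], alpha=0.5, epsilon=0.0)
scheduler.update("lrl", 1.0)
scheduler.update("hrl", 0.2)
next_language = scheduler.select()
```

In this toy run the low-resource action has the higher smoothed estimate, so it is selected next; in practice the rewards would come from evaluating the NMT model on development subsets.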
Anthology ID:
2025.mtsummit-1.6
Volume:
Proceedings of Machine Translation Summit XX: Volume 1
Month:
June
Year:
2025
Address:
Geneva, Switzerland
Editors:
Pierrette Bouillon, Johanna Gerlach, Sabrina Girletti, Lise Volkart, Raphael Rubino, Rico Sennrich, Ana C. Farinha, Marco Gaido, Joke Daems, Dorothy Kenny, Helena Moniz, Sara Szoc
Venue:
MTSummit
Publisher:
European Association for Machine Translation
Pages:
65–80
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-1.6/
Cite (ACL):
Alexis Allemann, Àlex R. Atrio, and Andrei Popescu-Belis. 2025. Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning. In Proceedings of Machine Translation Summit XX: Volume 1, pages 65–80, Geneva, Switzerland. European Association for Machine Translation.
Cite (Informal):
Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning (Allemann et al., MTSummit 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.mtsummit-1.6.pdf