Nada Ahmed Sharaf
2022
Adapting Large Multilingual Machine Translation Models to Unseen Low Resource Languages via Vocabulary Substitution and Neuron Selection
Mohamed A Abdelghaffar | Amr El Mogy | Nada Ahmed Sharaf
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)
We propose a method to adapt large multilingual machine translation models to a low-resource language (LRL) that was not included during the pre-training/training phases. We use neuron-ranking analysis to select the neurons that are most influential to the high-resource language (HRL) and fine-tune only this subset of the deep neural network's neurons. We experiment with three mechanisms to compute such a ranking. To allow for the potential difference in writing scripts between the HRL and LRL, we utilize an alignment model to substitute HRL elements of the predefined vocabulary with appropriate LRL ones. Our method improves on both zero-shot and the stronger baseline of directly fine-tuning the model on the low-resource data by 3 BLEU points in X -> E and 1.6 points in E -> X. We also show that as we simulate smaller data amounts, the gap between our method and direct fine-tuning continues to widen.
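The idea of ranking neurons and fine-tuning only the top-ranked subset can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the three ranking mechanisms, so mean absolute activation is used here as a stand-in criterion, and the function names (`rank_neurons`, `select_mask`, `masked_sgd_step`) and the plain SGD update are hypothetical rather than the authors' implementation.

```python
from typing import List


def rank_neurons(activations: List[List[float]]) -> List[int]:
    """Return neuron indices sorted by mean absolute activation, highest first.

    activations[i][j] is the activation of neuron j on example i
    (e.g. collected on HRL data). The scoring rule is an assumed
    stand-in, not one of the paper's three mechanisms.
    """
    n_examples = len(activations)
    n_neurons = len(activations[0])
    scores = [
        sum(abs(row[j]) for row in activations) / n_examples
        for j in range(n_neurons)
    ]
    return sorted(range(n_neurons), key=lambda j: scores[j], reverse=True)


def select_mask(ranking: List[int], keep_fraction: float) -> List[bool]:
    """Mark the top keep_fraction of ranked neurons as trainable."""
    k = max(1, int(len(ranking) * keep_fraction))
    selected = set(ranking[:k])
    return [j in selected for j in range(len(ranking))]


def masked_sgd_step(
    weights: List[List[float]],
    grads: List[List[float]],
    mask: List[bool],
    lr: float,
) -> List[List[float]]:
    """Apply an SGD update only to rows (neurons) whose mask entry is True.

    weights[j] holds the incoming weights of neuron j; frozen neurons
    are copied through unchanged.
    """
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if keep else list(w_row)
        for w_row, g_row, keep in zip(weights, grads, mask)
    ]


if __name__ == "__main__":
    acts = [[0.1, -2.0, 0.5], [0.2, 1.0, -0.5]]
    ranking = rank_neurons(acts)
    mask = select_mask(ranking, keep_fraction=0.34)
    weights = [[1.0, 1.0] for _ in range(3)]
    grads = [[0.5, 0.5] for _ in range(3)]
    print(ranking, mask, masked_sgd_step(weights, grads, mask, lr=0.1))
```

In a real framework the same effect would typically be obtained by zeroing the gradients of the unselected neurons before the optimizer step; the list-based version above only shows the selection logic.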