MultiFiT: Efficient Multi-lingual Language Model Fine-tuning

Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kadras, Sylvain Gugger, Jeremy Howard


Abstract
Pretrained language models are promising particularly for low-resource languages as they only require unlabelled data. However, training existing models requires huge amounts of compute, while pretrained cross-lingual models often underperform on low-resource languages. We propose Multi-lingual language model Fine-Tuning (MultiFiT) to enable practitioners to train and fine-tune language models efficiently in their own language. In addition, we propose a zero-shot method using an existing pretrained cross-lingual model. We evaluate our methods on two widely used cross-lingual classification datasets where they outperform models pretrained on orders of magnitude more data and compute. We release all models and code.
Anthology ID:
D19-1572
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Note:
Pages:
5702–5707
Language:
URL:
https://aclanthology.org/D19-1572
DOI:
10.18653/v1/D19-1572
Bibkey:
Cite (ACL):
Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kadras, Sylvain Gugger, and Jeremy Howard. 2019. MultiFiT: Efficient Multi-lingual Language Model Fine-tuning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5702–5707, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
MultiFiT: Efficient Multi-lingual Language Model Fine-tuning (Eisenschlos et al., EMNLP-IJCNLP 2019)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-script-update/D19-1572.pdf
Attachment:
 D19-1572.Attachment.zip
Code
 additional community code
Data
MLDoc