mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder
Abstract
Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings. To address these challenges, we propose mmT5, a modular multilingual sequence-to-sequence model. mmT5 utilizes language-specific modules during pre-training, which disentangle language-specific information from language-agnostic information. We identify representation drift during fine-tuning as a key limitation of modular generative models and develop strategies that enable effective zero-shot transfer. Our model outperforms mT5 at the same parameter sizes by a large margin on representative natural language understanding and generation tasks in 40+ languages. Compared to mT5, mmT5 raises the rate of generating text in the correct language under zero-shot settings from 7% to 99%, thereby greatly alleviating the source language hallucination problem.
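The sketch below illustrates the modular idea the abstract describes: a shared (language-agnostic) transformer sub-layer paired with small per-language bottleneck modules that hold the language-specific parameters. It is a minimal illustration only; the module placement, sizes, and names are assumptions for exposition, not the authors' exact mmT5 implementation.

```python
# Illustrative sketch only (not the authors' implementation): a feed-forward
# block with per-language bottleneck modules on top of shared parameters,
# approximating the "language-specific modules" described in the abstract.
import torch
import torch.nn as nn


class LanguageModularFFN(nn.Module):
    """Shared feed-forward sub-layer followed by a language-specific bottleneck."""

    def __init__(self, d_model: int, d_ff: int, languages: list, d_bottleneck: int = 64):
        super().__init__()
        # Language-agnostic (shared) parameters.
        self.shared_ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        # One small bottleneck module per language (language-specific parameters).
        self.lang_modules = nn.ModuleDict({
            lang: nn.Sequential(
                nn.Linear(d_model, d_bottleneck), nn.ReLU(), nn.Linear(d_bottleneck, d_model)
            )
            for lang in languages
        })
        self.norm = nn.LayerNorm(d_model)

    def forward(self, hidden: torch.Tensor, lang: str) -> torch.Tensor:
        shared = self.shared_ffn(hidden)
        # Route through the module of the current language; the residual path
        # keeps the shared representation language-agnostic.
        return self.norm(hidden + shared + self.lang_modules[lang](shared))


# Toy usage: for zero-shot cross-lingual transfer one would fine-tune only the
# shared parameters and then swap in the target-language module at inference,
# in the spirit of the transfer strategies the abstract alludes to.
layer = LanguageModularFFN(d_model=512, d_ff=2048, languages=["en", "de", "sw"])
x = torch.randn(2, 16, 512)  # (batch, sequence length, d_model)
out = layer(x, lang="en")
print(out.shape)  # torch.Size([2, 16, 512])
```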
- Anthology ID:
- 2023.findings-emnlp.132
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1978–2008
- URL:
- https://aclanthology.org/2023.findings-emnlp.132
- DOI:
- 10.18653/v1/2023.findings-emnlp.132
- Cite (ACL):
- Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, and Sebastian Ruder. 2023. mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1978–2008, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations (Pfeiffer et al., Findings 2023)
- PDF:
- https://aclanthology.org/2023.findings-emnlp.132.pdf