mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Saksham Singhal, Xian-Ling Mao, Heyan Huang, Xia Song, Furu Wei
Abstract
Multilingual T5 pretrains a sequence-to-sequence model on massive monolingual texts, which has shown promising results on many cross-lingual tasks. In this paper, we improve multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on seven multilingual benchmark datasets, including sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
- Anthology ID:
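To make the "translation pair span corruption" idea concrete, here is a minimal sketch of T5-style span corruption applied to a concatenated translation pair: masked spans are replaced by sentinel tokens in the encoder input, and the decoder target lists each sentinel followed by the removed tokens. This is an illustration only, not the authors' implementation; the `span_corrupt` function, the hand-picked spans, and the `</s>` separator are assumptions for the example.

```python
def span_corrupt(tokens, spans):
    """T5-style span corruption (illustrative sketch).

    Replace each (start, end) span in `tokens` with a sentinel token in
    the source sequence; the target lists each sentinel followed by the
    tokens it removed. For translation pair span corruption, `tokens` is
    a source-language sentence concatenated with its translation, so
    spans may be masked in either language; restricting spans to one
    side of the pair gives a translation-span-corruption-style variant.
    """
    source, target = [], []
    prev = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        source.extend(tokens[prev:start])  # keep unmasked tokens
        source.append(sentinel)            # mark the removed span
        target.append(sentinel)            # target recovers the span
        target.extend(tokens[start:end])
        prev = end
    source.extend(tokens[prev:])
    return source, target


# English sentence concatenated with its German translation
# (span boundaries are hand-picked here; pretraining samples them randomly).
pair = ["Thank", "you", "very", "much", "</s>", "Vielen", "Dank"]
src, tgt = span_corrupt(pair, [(1, 3), (5, 6)])
# src → ['Thank', '<extra_id_0>', 'much', '</s>', '<extra_id_1>', 'Dank']
# tgt → ['<extra_id_0>', 'you', 'very', '<extra_id_1>', 'Vielen']
```

Masking spans in one language while the translation stays visible encourages the model to use cross-lingual context to reconstruct them.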
- 2021.emnlp-main.125
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1671–1683
- Language:
- URL:
- https://aclanthology.org/2021.emnlp-main.125
- DOI:
- 10.18653/v1/2021.emnlp-main.125
- Cite (ACL):
- Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Saksham Singhal, Xian-Ling Mao, Heyan Huang, Xia Song, and Furu Wei. 2021. mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1671–1683, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs (Chi et al., EMNLP 2021)
- PDF:
- https://aclanthology.org/2021.emnlp-main.125.pdf
- Data
- MLQA, PAWS-X, TyDi QA, WikiLingua, XNLI, XQuAD