Saliency-based Multi-View Mixed Language Training for Zero-shot Cross-lingual Classification

Siyu Lai, Hui Huang, Dong Jing, Yufeng Chen, Jinan Xu, Jian Liu


Abstract
Recent multilingual pre-trained models, such as XLM-RoBERTa (XLM-R), have proven effective on many cross-lingual tasks. However, gaps remain between the contextualized representations of similar words in different languages. To address this problem, we propose a novel framework named Multi-View Mixed Language Training (MVMLT), which leverages code-switched data with multi-view learning to fine-tune XLM-R. MVMLT uses gradient-based saliency to extract the keywords most relevant to the downstream task and dynamically replaces them with their counterparts in the target language. Furthermore, MVMLT utilizes multi-view learning to encourage contextualized embeddings to align into a more refined language-invariant space. Extensive experiments on four languages show that our model achieves state-of-the-art results on zero-shot cross-lingual sentiment classification and dialogue state tracking, demonstrating the effectiveness of the proposed approach.
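The saliency-driven code-switching step described in the abstract can be sketched roughly as follows. This is a toy illustration under assumptions, not the authors' implementation: in the paper, gradients come from backpropagating the task loss through XLM-R, whereas here the per-token gradients, the example sentence, and the bilingual dictionary `en_es` are all hypothetical stand-ins.

```python
import numpy as np

def saliency_scores(token_grads):
    """Saliency of each token = L2 norm of the loss gradient w.r.t. its embedding."""
    return [float(np.linalg.norm(g)) for g in token_grads]

def code_switch(tokens, token_grads, bilingual_dict, k=2):
    """Replace the k most salient tokens with their target-language counterparts."""
    scores = saliency_scores(token_grads)
    top_k = set(sorted(range(len(tokens)), key=scores.__getitem__, reverse=True)[:k])
    return [bilingual_dict.get(tok, tok) if i in top_k else tok
            for i, tok in enumerate(tokens)]

# Toy example: these arrays stand in for d(loss)/d(embedding) per token.
tokens = ["the", "movie", "was", "great"]
grads = [np.array([0.1, 0.0]), np.array([0.9, 0.4]),
         np.array([0.05, 0.02]), np.array([1.0, 0.3])]
en_es = {"movie": "película", "great": "genial"}  # hypothetical EN->ES dictionary
mixed = code_switch(tokens, grads, en_es, k=2)
print(mixed)  # ['the', 'película', 'was', 'genial']
```

The task-relevant content words ("movie", "great") receive larger gradients and are swapped into the target language, while function words are left in the source language, yielding the mixed-language training sentence.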
Anthology ID:
2021.findings-emnlp.55
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venues:
EMNLP | Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
599–610
URL:
https://aclanthology.org/2021.findings-emnlp.55
DOI:
10.18653/v1/2021.findings-emnlp.55
Cite (ACL):
Siyu Lai, Hui Huang, Dong Jing, Yufeng Chen, Jinan Xu, and Jian Liu. 2021. Saliency-based Multi-View Mixed Language Training for Zero-shot Cross-lingual Classification. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 599–610, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Saliency-based Multi-View Mixed Language Training for Zero-shot Cross-lingual Classification (Lai et al., Findings 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.findings-emnlp.55.pdf