Multilingual and cross-lingual document classification: A meta-learning approach

Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova


Abstract
The great majority of languages in the world are considered under-resourced for successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in low-resource languages and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint-training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple, yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state-of-the-art on a number of languages while performing on-par on others, using only a small amount of labeled data.
Anthology ID:
2021.eacl-main.168
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1966–1976
URL:
https://aclanthology.org/2021.eacl-main.168
DOI:
10.18653/v1/2021.eacl-main.168
Cite (ACL):
Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, and Ekaterina Shutova. 2021. Multilingual and cross-lingual document classification: A meta-learning approach. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1966–1976, Online. Association for Computational Linguistics.
Cite (Informal):
Multilingual and cross-lingual document classification: A meta-learning approach (van der Heijden et al., EACL 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.eacl-main.168.pdf
Code:
mrvoh/meta_learning_multilingual_doc_classification
Data:
MLDoc