UDapter: Typology-based Language Adapters for Multilingual Dependency Parsing and Sequence Labeling

Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord


Abstract
Recent advances in multilingual language modeling have brought the idea of a truly universal parser closer to reality. However, such models are still not immune to the “curse of multilinguality”: Cross-language interference and restrained model capacity remain major obstacles. To address this, we propose a novel language adaptation approach by introducing contextual language adapters to a multilingual parser. Contextual language adapters make it possible to learn adapters via language embeddings while sharing model parameters across languages based on contextual parameter generation. Moreover, our method allows for an easy but effective integration of existing linguistic typology features into the parsing model. Because not all typological features are available for every language, we further combine typological feature prediction with parsing in a multi-task model that achieves very competitive parsing performance without the need for an external prediction system for missing features. The resulting parser, UDapter, can be used for dependency parsing as well as sequence labeling tasks such as POS tagging, morphological tagging, and NER. In dependency parsing, it outperforms strong monolingual and multilingual baselines on the majority of both high-resource and low-resource (zero-shot) languages, showing the success of the proposed adaptation approach. In sequence labeling tasks, our parser surpasses the baseline on high-resource languages, and performs very competitively in a zero-shot setting. Our in-depth analyses show that adapter generation via typological features of languages is key to this success.
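The idea of contextual parameter generation mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the toy dimensions, the random weights, and the binary typology vectors are all assumptions made for illustration, and the adapter is reduced to a single residual linear map. The sketch only shows the general pattern in which a typological feature vector is projected to a language embedding, and that embedding is contracted with a shared tensor to produce language-specific adapter weights.

```python
import random

def project(features, proj):
    """Map a typological feature vector to a language embedding (one linear layer)."""
    return [sum(w * f for w, f in zip(row, features)) for row in proj]

def generate_adapter(lang_emb, generator):
    """Contract a shared tensor indexed [row][col][emb] with the language embedding."""
    return [[sum(g * e for g, e in zip(cell, lang_emb)) for cell in row]
            for row in generator]

def apply_adapter(hidden, weights):
    """Residual adapter: hidden + weights @ hidden (bottleneck and nonlinearity omitted)."""
    return [h + sum(w * x for w, x in zip(row, hidden))
            for h, row in zip(hidden, weights)]

random.seed(0)
N_FEATS, EMB, HID = 4, 3, 2  # toy sizes, chosen only for illustration

# Parameters shared by all languages (random stand-ins for learned weights).
proj = [[random.uniform(-1, 1) for _ in range(N_FEATS)] for _ in range(EMB)]
generator = [[[random.uniform(-1, 1) for _ in range(EMB)]
              for _ in range(HID)] for _ in range(HID)]

# Hypothetical binary typology vectors for two made-up languages.
typology = {"lang_a": [1, 0, 1, 0], "lang_b": [0, 1, 1, 0]}
hidden = [0.5, -0.25]  # stand-in for one token's hidden state

outputs = {}
for lang, feats in typology.items():
    adapter = generate_adapter(project(feats, proj), generator)
    outputs[lang] = apply_adapter(hidden, adapter)
```

In this sketch only `proj` and `generator` would be trained, so every language shares the same parameters, and a language unseen in training still receives an adapter as long as a typology vector is available for it.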
Anthology ID: 2022.cl-3.3
Volume: Computational Linguistics, Volume 48, Issue 3 - September 2022
Month: September
Year: 2022
Address: Cambridge, MA
Venue: CL
Publisher: MIT Press
Pages: 555–592
URL: https://aclanthology.org/2022.cl-3.3
DOI: 10.1162/coli_a_00443
Cite (ACL): Ahmet Üstün, Arianna Bisazza, Gosse Bouma, and Gertjan van Noord. 2022. UDapter: Typology-based Language Adapters for Multilingual Dependency Parsing and Sequence Labeling. Computational Linguistics, 48(3):555–592.
Cite (Informal): UDapter: Typology-based Language Adapters for Multilingual Dependency Parsing and Sequence Labeling (Üstün et al., CL 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.cl-3.3.pdf
Data: Universal Dependencies