On the Choice of Auxiliary Languages for Improved Sequence Tagging

Lukas Lange, Heike Adel, Jannik Strötgen


Abstract
Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models. In this analysis paper, we investigate whether the best auxiliary language can be predicted based on language distances and show that the most related language is not always the best auxiliary language. Further, we show that attention-based meta-embeddings can effectively combine pre-trained embeddings from different languages for sequence tagging and set new state-of-the-art results for part-of-speech tagging in five languages.
Anthology ID:
2020.repl4nlp-1.13
Volume:
Proceedings of the 5th Workshop on Representation Learning for NLP
Month:
July
Year:
2020
Address:
Online
Editors:
Spandana Gella, Johannes Welbl, Marek Rei, Fabio Petroni, Patrick Lewis, Emma Strubell, Minjoon Seo, Hannaneh Hajishirzi
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
95–102
URL:
https://aclanthology.org/2020.repl4nlp-1.13
DOI:
10.18653/v1/2020.repl4nlp-1.13
Cite (ACL):
Lukas Lange, Heike Adel, and Jannik Strötgen. 2020. On the Choice of Auxiliary Languages for Improved Sequence Tagging. In Proceedings of the 5th Workshop on Representation Learning for NLP, pages 95–102, Online. Association for Computational Linguistics.
Cite (Informal):
On the Choice of Auxiliary Languages for Improved Sequence Tagging (Lange et al., RepL4NLP 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2020.repl4nlp-1.13.pdf
Video:
http://slideslive.com/38929779