Cross-lingual Annotation Projection Is Effective for Neural Part-of-Speech Tagging

Matthias Huck, Diana Dutka, Alexander Fraser


Abstract
We tackle part-of-speech (POS) tagging with a neural model in the zero-resource scenario, where we have no access to gold-standard POS training data. We compare this scenario with the low-resource scenario, where we have access to a small amount of gold-standard POS training data. Our experiments focus on Ukrainian as a representative of under-resourced languages. Russian is closely related to Ukrainian, so we exploit gold-standard Russian POS tags. We consider four techniques for Ukrainian POS tagging: zero-shot tagging and cross-lingual annotation projection for the zero-resource scenario, and self-training and multilingual learning for the low-resource scenario. We find that cross-lingual annotation projection works particularly well in the zero-resource scenario.
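The abstract names cross-lingual annotation projection but does not spell out the procedure. As a rough illustration of the general idea only, not the authors' implementation, the sketch below copies POS tags from the tokens of a tagged source sentence onto the tokens of its translation. It assumes word alignments are already available from some external aligner; the function name `project_tags`, the majority-vote rule, and the use of `None` for unaligned tokens are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def project_tags(
    source_tags: Sequence[str],
    target_length: int,
    alignment: Sequence[Tuple[int, int]],
) -> List[Optional[str]]:
    """Project POS tags from source tokens onto target tokens.

    `alignment` holds (source_index, target_index) pairs (assumed to be
    given). A target token aligned to several source tokens receives the
    most frequent tag among them; an unaligned target token receives None.
    """
    # One vote counter per target token.
    votes = [Counter() for _ in range(target_length)]
    for src_idx, tgt_idx in alignment:
        votes[tgt_idx][source_tags[src_idx]] += 1
    # Pick the majority tag, or None where no source token is aligned.
    return [v.most_common(1)[0][0] if v else None for v in votes]


# Toy input: three tagged source tokens, four target tokens,
# with the last target token left unaligned.
projected = project_tags(
    source_tags=["PRON", "VERB", "NOUN"],
    target_length=4,
    alignment=[(0, 0), (1, 1), (2, 2)],
)
print(projected)
```

In a setting like the one described, the projected tags would serve as silver-standard training data for a target-language tagger; how unaligned or conflicting tokens are handled is a design decision this sketch leaves open.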
Anthology ID:
W19-1425
Volume:
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
June
Year:
2019
Address:
Ann Arbor, Michigan
Editors:
Marcos Zampieri, Preslav Nakov, Shervin Malmasi, Nikola Ljubešić, Jörg Tiedemann, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
223–233
URL:
https://aclanthology.org/W19-1425
DOI:
10.18653/v1/W19-1425
Cite (ACL):
Matthias Huck, Diana Dutka, and Alexander Fraser. 2019. Cross-lingual Annotation Projection Is Effective for Neural Part-of-Speech Tagging. In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 223–233, Ann Arbor, Michigan. Association for Computational Linguistics.
Cite (Informal):
Cross-lingual Annotation Projection Is Effective for Neural Part-of-Speech Tagging (Huck et al., VarDial 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/W19-1425.pdf