Abstract
We introduce a cross-lingual neural part-of-speech tagger that learns from disparate sources of distant supervision, and realistically scales to hundreds of low-resource languages. The model exploits annotation projection, instance selection, tag dictionaries, morphological lexicons, and distributed representations, all in a uniform framework. The approach is simple, yet surprisingly effective, resulting in a new state of the art without access to any gold annotated data.
- Anthology ID:
- D18-1061
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 614–620
- URL:
- https://aclanthology.org/D18-1061
- DOI:
- 10.18653/v1/D18-1061
- Cite (ACL):
- Barbara Plank and Željko Agić. 2018. Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 614–620, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging (Plank & Agić, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/D18-1061.pdf
- Code:
- bplank/bilstm-aux