Multilingual Projection for Parsing Truly Low-Resource Languages
Željko Agić, Anders Johannsen, Barbara Plank, Héctor Martínez Alonso, Natalie Schluter, Anders Søgaard
Abstract
We propose a novel approach to cross-lingual part-of-speech tagging and dependency parsing for truly low-resource languages. Our annotation projection-based approach yields tagging and parsing models for over 100 languages. All that is needed are freely available parallel texts, and taggers and parsers for resource-rich languages. The empirical evaluation across 30 test languages shows that our method consistently provides top-level accuracies, close to established upper bounds, and outperforms several competitive baselines.
- Anthology ID:
- Q16-1022
- Volume:
- Transactions of the Association for Computational Linguistics, Volume 4
- Year:
- 2016
- Address:
- Cambridge, MA
- Editors:
- Lillian Lee, Mark Johnson, Kristina Toutanova
- Venue:
- TACL
- Publisher:
- MIT Press
- Pages:
- 301–312
- URL:
- https://aclanthology.org/Q16-1022
- DOI:
- 10.1162/tacl_a_00100
- Cite (ACL):
- Željko Agić, Anders Johannsen, Barbara Plank, Héctor Martínez Alonso, Natalie Schluter, and Anders Søgaard. 2016. Multilingual Projection for Parsing Truly Low-Resource Languages. Transactions of the Association for Computational Linguistics, 4:301–312.
- Cite (Informal):
- Multilingual Projection for Parsing Truly Low-Resource Languages (Agić et al., TACL 2016)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/Q16-1022.pdf