Endangered Languages meet Modern NLP
Antonios Anastasopoulos, Christopher Cox, Graham Neubig, Hilaria Cruz
Abstract
This tutorial will focus on NLP for endangered language documentation and revitalization. First, we will acquaint the attendees with the process and the challenges of language documentation, showing how the needs of the language communities and the documentary linguists map to specific NLP tasks. We will then present the state of the art in NLP applied in this particularly challenging setting (extremely low-resource datasets, noisy transcriptions, limited annotations, non-standard orthographies). In doing so, we will also analyze the challenges of working in this domain and expand on both the capabilities and the limitations of current NLP approaches. Our ultimate goal is to motivate more NLP practitioners to work in this very important direction, and to provide them with both the tools and an understanding of the limitations and challenges, which are needed in order to have an impact.
- Anthology ID:
- 2020.coling-tutorials.7
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics: Tutorial Abstracts
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- COLING
- Publisher:
- International Committee for Computational Linguistics
- Pages:
- 39–45
- URL:
- https://aclanthology.org/2020.coling-tutorials.7
- DOI:
- 10.18653/v1/2020.coling-tutorials.7
- Cite (ACL):
- Antonios Anastasopoulos, Christopher Cox, Graham Neubig, and Hilaria Cruz. 2020. Endangered Languages meet Modern NLP. In Proceedings of the 28th International Conference on Computational Linguistics: Tutorial Abstracts, pages 39–45, Barcelona, Spain (Online). International Committee for Computational Linguistics.
- Cite (Informal):
- Endangered Languages meet Modern NLP (Anastasopoulos et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.coling-tutorials.7.pdf