Closing the NLP Gap: Documentary Linguistics and NLP Need a Shared Software Infrastructure

Luke Gessler


Abstract
For decades, researchers in natural language processing and computational linguistics have been developing models and algorithms that aim to serve the needs of language documentation projects. However, these models have seen little use in language documentation despite their great potential for making documentary linguistic artefacts better and easier to produce. In this work, we argue that a major reason for this NLP gap is the lack of a strong foundation of application software which can on the one hand serve the complex needs of language documentation and on the other hand provide effortless integration with NLP models. We further present and describe a work-in-progress system we have developed to serve this need, Glam.
Anthology ID:
2022.computel-1.15
Volume:
Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ComputEL
Publisher:
Association for Computational Linguistics
Pages:
119–126
URL:
https://aclanthology.org/2022.computel-1.15
DOI:
10.18653/v1/2022.computel-1.15
Cite (ACL):
Luke Gessler. 2022. Closing the NLP Gap: Documentary Linguistics and NLP Need a Shared Software Infrastructure. In Proceedings of the Fifth Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 119–126, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Closing the NLP Gap: Documentary Linguistics and NLP Need a Shared Software Infrastructure (Gessler, ComputEL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.computel-1.15.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2022.computel-1.15.mp4