The Missing Middle: Language Documentation Needs Better Infrastructure, Not Better Models

Luke Gessler, Antonios Anastasopoulos, Sandra Auderset, Timotheus Bodt, Shobhana Chelliah, Sebastien Christian, Maxime Fily, Santiago Herrera, Eva Huber, Sharid Loaiciga, Marieke Meelen, Robert Östling, Alexis Palmer, Eline Visser


Abstract
Despite decades of progress in human language technology (HLT) and growing research interest in endangered languages, practical uptake of HLT in documentary linguistics workflows remains rare. In this opinion piece, we report on a structured dialogue among approximately twenty academics convened to diagnose why this gap persists. Across all topics, we identify a recurring structural problem, which we call the missing middle: despite the existence of many potentially useful HLTs, the connective infrastructure necessary to make them genuinely accessible to linguists and language communities does not exist. We report the details of our discussion and make four specific recommendations for how those active in language documentation and HLT research might orient their future work.
Anthology ID:
2026.computel-1.15
Volume:
Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Godfred Agyapong, Sarah Moeller, Antti Arppe, Ali Marashian, Daisy Rosenblum
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
136–147
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.computel-1.15/
Cite (ACL):
Luke Gessler, Antonios Anastasopoulos, Sandra Auderset, Timotheus Bodt, Shobhana Chelliah, Sebastien Christian, Maxime Fily, Santiago Herrera, Eva Huber, Sharid Loaiciga, Marieke Meelen, Robert Östling, Alexis Palmer, and Eline Visser. 2026. The Missing Middle: Language Documentation Needs Better Infrastructure, Not Better Models. In Proceedings of the Ninth Workshop on the Use of Computational Methods in the Study of Endangered Languages (ComputEL-9), pages 136–147, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
The Missing Middle: Language Documentation Needs Better Infrastructure, Not Better Models (Gessler et al., ComputEL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.computel-1.15.pdf
Supplementary material:
2026.computel-1.15.SupplementaryMaterial.txt