Towards Open-Ended Discovery for Low-Resource NLP

Bonaventure F. P. Dossou, Henri Aïdasso


Abstract
Natural language processing (NLP) for low-resource languages remains fundamentally constrained by the lack of textual corpora, standardized orthographies, and scalable annotation pipelines. While recent advances in large language models have improved cross-lingual transfer, they remain inaccessible to underrepresented communities due to their reliance on massive, pre-collected data and centralized infrastructure. In this position paper, we argue for a paradigm shift toward open-ended, interactive language discovery, where AI systems learn new languages dynamically through dialogue rather than from static datasets. We contend that the future of language technology, particularly for low-resource and under-documented languages, must move beyond static data collection pipelines toward uncertainty-driven discovery, where learning emerges from human-machine collaboration. We propose a framework grounded in joint human-machine uncertainty, combining epistemic uncertainty from the model with hesitation cues and confidence signals from human speakers to guide interaction, query selection, and memory retention. This paper is a call to action: we advocate a rethinking of how AI engages with human knowledge in under-documented languages, moving from extractive data collection toward participatory, co-adaptive learning processes that respect and empower communities while discovering and preserving the world’s linguistic diversity. This vision aligns with principles of human-centered AI and participatory design, emphasizing interactive, cooperative model building between AI systems and speakers.
Anthology ID:
2025.uncertainlp-main.24
Volume:
Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editor:
Noidea Noidea
Venues:
UncertaiNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
287–297
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.uncertainlp-main.24/
Cite (ACL):
Bonaventure F. P. Dossou and Henri Aïdasso. 2025. Towards Open-Ended Discovery for Low-Resource NLP. In Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025), pages 287–297, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Towards Open-Ended Discovery for Low-Resource NLP (Dossou & Aïdasso, UncertaiNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.uncertainlp-main.24.pdf