John Ruttan
2025
Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language
Jesus Alvarez C | Daua Karajeanes | Ashley Prado | John Ruttan | Ivory Yang | Sean O’Brien | Vasu Sharma | Kevin Zhu
Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
The digital exclusion of endangered languages remains a critical challenge in NLP, limiting both linguistic research and revitalization efforts. This study introduces the first computational investigation of Comanche, an Uto-Aztecan language on the verge of extinction, demonstrating how minimal-cost, community-informed NLP interventions can support language preservation. We present a manually curated dataset of 412 phrases, a synthetic data generation pipeline, and an empirical evaluation of GPT-4o and GPT-4o-mini for language identification. Our experiments reveal that while LLMs struggle with Comanche in zero-shot settings, few-shot prompting significantly improves performance, achieving near-perfect accuracy with just five examples. Our findings highlight the potential of targeted NLP methodologies in low-resource contexts and emphasize that visibility is the first step toward inclusion. By establishing a foundation for Comanche in NLP, we advocate for computational approaches that prioritize accessibility, cultural sensitivity, and community engagement.
Co-authors
- Jesus Alvarez C 1
- Daua Karajeanes 1
- Sean O’Brien 1
- Ashley Prado 1
- Vasu Sharma 1