Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language
Jesus Alvarez C, Daua Karajeanes, Ashley Prado, John Ruttan, Ivory Yang, Sean O’brien, Vasu Sharma, Kevin Zhu
Abstract
The digital exclusion of endangered languages remains a critical challenge in NLP, limiting both linguistic research and revitalization efforts. This study introduces the first computational investigation of Comanche, an Uto-Aztecan language on the verge of extinction, demonstrating how minimal-cost, community-informed NLP interventions can support language preservation. We present a manually curated dataset of 412 phrases, a synthetic data generation pipeline, and an empirical evaluation of GPT-4o and GPT-4o-mini for language identification. Our experiments reveal that while LLMs struggle with Comanche in zero-shot settings, few-shot prompting significantly improves performance, achieving near-perfect accuracy with just five examples. Our findings highlight the potential of targeted NLP methodologies in low-resource contexts and emphasize that visibility is the first step toward inclusion. By establishing a foundation for Comanche in NLP, we advocate for computational approaches that prioritize accessibility, cultural sensitivity, and community engagement.
- Anthology ID: 2025.americasnlp-1.4
- Volume: Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
- Month: May
- Year: 2025
- Address: Albuquerque, New Mexico
- Editors: Manuel Mager, Abteen Ebrahimi, Robert Pugh, Shruti Rijhwani, Katharina Von Der Wense, Luis Chiruzzo, Rolando Coto-Solano, Arturo Oncevay
- Venues: AmericasNLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 27–37
- URL: https://preview.aclanthology.org/fix-sig-urls/2025.americasnlp-1.4/
- Cite (ACL): Jesus Alvarez C, Daua Karajeanes, Ashley Prado, John Ruttan, Ivory Yang, Sean O’brien, Vasu Sharma, and Kevin Zhu. 2025. Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language. In Proceedings of the Fifth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), pages 27–37, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal): Advancing Uto-Aztecan Language Technologies: A Case Study on the Endangered Comanche Language (Alvarez C et al., AmericasNLP 2025)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2025.americasnlp-1.4.pdf
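
The abstract describes evaluating language identification with few-shot prompting. The following is a minimal sketch of how such a setup could look, not the authors' actual pipeline: the prompt wording, the function names `build_few_shot_prompt` and `accuracy`, and the `predict` callable (a stand-in for any model call) are all assumptions made for illustration. No real Comanche data is included; the phrases in the usage note are placeholders.

```python
from typing import Callable, List, Tuple

# (phrase, language label); hypothetical data shape, not the paper's format
Example = Tuple[str, str]


def build_few_shot_prompt(shots: List[Example], query: str) -> str:
    """Assemble a language-identification prompt from labeled examples.

    With an empty `shots` list this reduces to a zero-shot prompt.
    """
    blocks = ["Identify the language of each phrase."]
    for phrase, label in shots:
        blocks.append(f"Phrase: {phrase}\nLanguage: {label}")
    # The final block leaves the label blank for the model to complete.
    blocks.append(f"Phrase: {query}\nLanguage:")
    return "\n\n".join(blocks)


def accuracy(
    predict: Callable[[str], str],
    test_set: List[Example],
    shots: List[Example],
) -> float:
    """Fraction of test phrases whose predicted label matches the gold label.

    `predict` is an assumed stand-in for a model call: it takes a prompt
    string and returns the predicted language name.
    """
    if not test_set:
        return 0.0
    correct = 0
    for phrase, gold in test_set:
        prompt = build_few_shot_prompt(shots, phrase)
        if predict(prompt).strip().lower() == gold.strip().lower():
            correct += 1
    return correct / len(test_set)
```

For instance, passing five placeholder pairs such as `("<Comanche phrase 1>", "Comanche")` as `shots` yields a five-shot prompt, and swapping `predict` for a function that queries a model allows zero-shot and few-shot accuracy to be compared on the same test set.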