Evaluating Frontier LLM Translation Capability for Lakota

Lance Robertson

Abstract

We evaluate seven large language models (four proprietary and three open-weight) on bidirectional Lakota–English translation using 200 sentence pairs from the New Lakota Dictionary. Each model is evaluated with and without extended reasoning, where the provider’s API permits. The best model (Gemini 3.1 Pro) achieves a mean chrF++ of 59.4 on Lakota→English and 42.6 on English→Lakota; the strongest open-weight model trails the proprietary leaders, and no model produces reliable translation in either direction. Two independent LLM judges from different model families show substantial agreement (Cohen’s κ = 0.75); they rate between 6% (GPT-5.2) and 60% (Gemini) of outputs as semantically equivalent, figures that diverge markedly from the chrF++ scores. For the open-weight models, enabling reasoning changes refusal behavior far more than translation quality: it surfaces the limitation rather than overcoming it. Diacritic-normalization analysis shows that models produce roughly correct base characters but place diacritical marks inconsistently. All results and evaluation code are publicly available at https://github.com/robotson/lakota-translation-benchmark.
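Two of the analyses named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation (that lives in the repository linked above); the function names `strip_diacritics` and `cohen_kappa` are hypothetical, and the sketch assumes that diacritic normalization means Unicode NFD decomposition followed by removal of combining marks, and that each judge emits one categorical label per sentence.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks, keeping base characters.

    Assumption: this is one plausible reading of "diacritic normalization";
    the paper's own procedure may differ.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def cohen_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (list(labels_a).count(c) / n) * (list(labels_b).count(c) / n)
        for c in categories
    )
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

Scoring a hypothesis against a reference once on the raw strings and once after `strip_diacritics` would separate base-character errors from diacritic-placement errors: a large gap between the two scores points to misplaced marks rather than wrong letters. Characters with no canonical decomposition, such as ŋ, pass through unchanged.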

Anthology ID: 2026.americasnlp-6.2
Volume: Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Manuel Mager, Abteen Ebrahimi, Minh Duc Bui, Robert Pugh, Arturo Oncevay, Luis Chiruzzo, Rolando Coto Solano, Shruti Rijhwani, Katharina Von Der Wense
Venues: AmericasNLP | WS
Publisher: Association for Computational Linguistics
Pages: 11–21
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.americasnlp-6.2/
Cite (ACL): Lance Robertson. 2026. Evaluating Frontier LLM Translation Capability for Lakota. In Proceedings of the Sixth Workshop on NLP for Indigenous Languages of the Americas (AmericasNLP), pages 11–21, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Evaluating Frontier LLM Translation Capability for Lakota (Robertson, AmericasNLP 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.americasnlp-6.2.pdf
Supplementary material: 2026.americasnlp-6.2.SupplementaryMaterial.zip
