Exploring Limitations and Risks of LLM-Based Grammatical Error Correction for Indigenous Languages

Flammie A Pirinen, Linda Wiechetek


Abstract
Rule-based grammatical error correction has long been seen as the most effective way to create user-friendly end-user systems for grammatical error correction (GEC). However, in recent years, large language models and generative AI systems based on that technology have progressed fast enough to challenge the traditional GEC approach. In this article we show which possibilities and limitations the LLM-based approach bears for Indigenous languages that have a more limited digital presence in large language model data and a different literacy background than English. We show experiments in North Sámi, an Indigenous language of Northern Europe.
Anthology ID:
2025.computel-main.8
Volume:
Proceedings of the Eight Workshop on the Use of Computational Methods in the Study of Endangered Languages
Month:
March
Year:
2025
Address:
Honolulu, Hawaii, USA
Editors:
Jordan Lachler, Godfred Agyapong, Antti Arppe, Sarah Moeller, Aditi Chaudhary, Shruti Rijhwani, Daisy Rosenblum
Venues:
ComputEL | WS
Publisher:
Association for Computational Linguistics
Pages:
74–81
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.computel-main.8/
Cite (ACL):
Flammie A Pirinen and Linda Wiechetek. 2025. Exploring Limitations and Risks of LLM-Based Grammatical Error Correction for Indigenous Languages. In Proceedings of the Eight Workshop on the Use of Computational Methods in the Study of Endangered Languages, pages 74–81, Honolulu, Hawaii, USA. Association for Computational Linguistics.
Cite (Informal):
Exploring Limitations and Risks of LLM-Based Grammatical Error Correction for Indigenous Languages (Pirinen & Wiechetek, ComputEL 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.computel-main.8.pdf