Mikhail Zolotilin

2026

FinnGEC: Benchmarking Grammatical Error Correction for Finnish
Anh-Duc Vu | Mikhail Zolotilin | Jue Hou | Anisia Katinskaia | Yiheng Wu | Roman Yangarber
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)

Grammatical error correction (GEC) is a natural language processing task critical for improving language quality, supporting communication efficacy, and for language learning and teaching. To date, most research in GEC has focused on major, resource-rich languages such as English, while lower-resource languages remain underexplored. In this paper, we focus on GEC for Finnish. We build a dataset based on data from real-world language learners. We explore various approaches to GEC, including fine-tuning transformer models and zero-shot LLM prompting. We also adapt ERRANT, a popular GEC evaluation tool, for the Finnish language, to evaluate the performance of the models. Our results indicate that the performance of GEC for Finnish is promising, but requires further research. To the best of our knowledge, this is the first in-depth exploration of GEC for Finnish; we provide benchmarks, datasets, and code for GEC for Finnish—by releasing our training and test data and the code for Finnish ERRANT—to support further research on this important task.

Co-authors

Venues

BEA1
WS1

Fix author