Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory

Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava, Brijesh Bhatt


Abstract
Grammatical Error Correction (GEC) is a fundamental task in Natural Language Processing that focuses on automatically detecting and correcting grammatical errors in text. In this paper, we present a novel approach to GEC for Gujarati, an Indian language spoken by over 55 million people worldwide. Our approach combines a large language model with non-parametric memory modules to address the low-resource challenge. We evaluate our system on human-annotated and synthetic datasets, and the results are promising for Gujarati. The proposed approach is generic enough to be adapted to other languages. Furthermore, we release a publicly available evaluation dataset for Gujarati GEC, along with an adapted version of the ERRANT framework that enables error-type-wise evaluation in Gujarati.
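The abstract does not describe how the memory modules work, so the following is only a minimal sketch of one common way to pair a language model with a non-parametric memory: store (erroneous, corrected) sentence pairs, retrieve the entries most similar to the input, and place them in the prompt as few-shot examples. The names `CorrectionMemory`, `char_ngrams`, and `build_prompt`, the character-trigram Jaccard similarity, the prompt wording, and the English toy sentences are all assumptions made for illustration. They are not taken from the paper, and the call to the language model itself is left out.

```python
from typing import List, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a string (works on any Unicode script)."""
    padded = f" {text.strip()} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class CorrectionMemory:
    """Hypothetical non-parametric memory: a growable list of correction pairs."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def add(self, erroneous: str, corrected: str) -> None:
        # New knowledge is added by appending, with no model retraining.
        self.entries.append((erroneous, corrected))

    def retrieve(self, query: str, k: int = 2) -> List[Tuple[str, str]]:
        # Rank stored pairs by similarity between the query and the erroneous side.
        q = char_ngrams(query)
        ranked = sorted(
            self.entries,
            key=lambda pair: jaccard(q, char_ngrams(pair[0])),
            reverse=True,
        )
        return ranked[:k]


def build_prompt(query: str, examples: List[Tuple[str, str]]) -> str:
    """Assemble a few-shot prompt from retrieved pairs (wording is an assumption)."""
    parts = ["Correct the grammatical errors in the sentence."]
    for erroneous, corrected in examples:
        parts.append(f"Input: {erroneous}\nOutput: {corrected}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# Toy English pairs stand in for Gujarati data, which is not shown in the source.
memory = CorrectionMemory()
memory.add("She go to school.", "She goes to school.")
memory.add("I has a book.", "I have a book.")
memory.add("They is happy.", "They are happy.")

query = "He go to market."
prompt = build_prompt(query, memory.retrieve(query, k=1))
print(prompt)
```

In this sketch the memory can grow or be edited without touching model weights, which is the general appeal of a non-parametric component in a low-resource setting. A real system would likely replace the trigram similarity with a stronger retriever.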
Anthology ID:
2025.findings-ijcnlp.28
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venue:
Findings
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages:
473–485
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.28/
Cite (ACL):
Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava, and Brijesh Bhatt. 2025. Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 473–485, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory (Dobariya et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.28.pdf