Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory
Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava, Brijesh Bhatt
Abstract
Grammatical Error Correction (GEC) is a fundamental task in Natural Language Processing that focuses on automatically detecting and correcting grammatical errors in text. In this paper, we present a novel approach for GEC for Gujarati. Gujarati is an Indian language spoken by over 55 million people worldwide. Our approach combines a large language model with non-parametric memory modules to address the low-resource challenge. We have evaluated our system on human-annotated and synthetic datasets. The overall result indicates promising results for Gujarati. The proposed approach is generic enough to be adopted by other languages. Furthermore, we release a publicly available evaluation dataset for Gujarati GEC along with an adapted version of the ERRANT framework to enable error-type-wise evaluation in Gujarati.- Anthology ID:
- 2025.findings-ijcnlp.28
- Volume:
- Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
- Month:
- December
- Year:
- 2025
- Address:
- Mumbai, India
- Editors:
- Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
- Venue:
- Findings
- SIG:
- Publisher:
- The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
- Note:
- Pages:
- 473–485
- Language:
- URL:
- https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.28/
- DOI:
- Cite (ACL):
- Vrund Dobariya, Jatayu Baxi, Bhavika Gambhava, and Brijesh Bhatt. 2025. Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 473–485, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
- Cite (Informal):
- Smruti: Grammatical Error Correction for Gujarati using LLMs with Non-Parametric Memory (Dobariya et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.28.pdf