akhilrajeevp at BHASHA Task 1: Minimal-Edit Instruction Tuning for Low-Resource Indic GEC

Akhil Rajeev P


Abstract
Grammatical error correction (GEC) for Indic languages faces limited supervision, diverse scripts, and rich morphology. We propose an augmentation-free setup that uses instruction-tuned large language models and conservative decoding. A 12B Gemma 3 model is instruction-tuned in bnb 4-bit precision with parameter-efficient fine-tuning (PEFT) and Alpaca-style formatting. Decoding follows a deterministic, constraint-aware procedure with a lightweight normaliser that encourages minimal, meaning-preserving edits. After instruction fine-tuning (IFT), inference uses a fixed, language-specific prompt synthesised directly from a deterministic error classifier's taxonomy, label distributions, and precedence ordering computed on the training corpus. Under the official untuned GLEU evaluation, the system scores 92.41 on Malayalam (sixth overall) and 81.44 on Hindi (third overall). These results indicate that classifier-informed prompt design, adapter-based instruction tuning, and deterministic decoding provide a reproducible and computationally efficient alternative to augmentation-centred pipelines for Indic GEC. The approach also motivates future work on stronger morphosyntactic constraints and human-centred evaluation of conservative edits.
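As a concrete illustration of the setup the abstract describes, the following minimal Python sketch shows bnb 4-bit (NF4) loading, a LoRA adapter for parameter-efficient fine-tuning, an Alpaca-style prompt, and greedy (deterministic) decoding with Hugging Face transformers and peft. The checkpoint id, LoRA hyperparameters, and prompt wording are illustrative assumptions; the paper's classifier-derived language-specific prompt and its constraint-aware normaliser are not reproduced here.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

MODEL_ID = "google/gemma-3-12b-it"  # assumed checkpoint id

# bnb 4-bit quantisation (NF4), as named in the abstract.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)

# PEFT: attach a small LoRA adapter; the frozen base stays in 4-bit.
# Rank, alpha, and target modules are illustrative defaults, not the
# paper's reported values.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# Alpaca-style formatting with a fixed instruction. In the paper this
# prompt is synthesised from the error classifier's taxonomy; here it is
# a hypothetical stand-in.
prompt = (
    "### Instruction:\nCorrect the grammatical errors in the sentence "
    "using minimal, meaning-preserving edits.\n\n"
    "### Input:\n<source sentence>\n\n### Response:\n"
)

# Deterministic decoding: greedy search (do_sample=False). The
# lightweight normaliser would post-process this output and is omitted.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))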
Anthology ID:
2025.bhasha-1.17
Volume:
Proceedings of the 1st Workshop on Benchmarks, Harmonization, Annotation, and Standardization for Human-Centric AI in Indian Languages (BHASHA 2025)
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Arnab Bhattacharya, Pawan Goyal, Saptarshi Ghosh, Kripabandhu Ghosh
Venues:
BHASHA | WS
Publisher:
Association for Computational Linguistics
Pages:
164–174
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.bhasha-1.17/
Cite (ACL):
Akhil Rajeev P. 2025. akhilrajeevp at BHASHA Task 1: Minimal-Edit Instruction Tuning for Low-Resource Indic GEC. In Proceedings of the 1st Workshop on Benchmarks, Harmonization, Annotation, and Standardization for Human-Centric AI in Indian Languages (BHASHA 2025), pages 164–174, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
akhilrajeevp at BHASHA Task 1: Minimal-Edit Instruction Tuning for Low-Resource Indic GEC (P, BHASHA 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.bhasha-1.17.pdf