Akshay Ramesh


2025

Grammatical Error Correction (GEC) for low-resource Indic languages remains challenging due to limited annotated data and morphological complexity. We present a hybrid neurosymbolic GEC system that combines neural sequence-to-sequence models with explicit, language-specific rule-based pattern matching. Our approach fine-tunes pre-trained mT5 models with parameter-efficient LoRA adaptation on aggressively augmented data, then combines the neural output with learned correction rules through ensemble strategies. The proposed hybrid architecture achieved 85.34% GLEU for Tamil (Rank 8) and 95.06% GLEU for Malayalam (Rank 2) on the provided IndicGEC test sets, outperforming the individual neural and rule-based approaches. The system incorporates conservative safety mechanisms that guard against catastrophic deletions and over-corrections, improving robustness and real-world applicability. Our work demonstrates that extremely low-resource GEC can be addressed effectively by combining neural generalization with symbolic precision.
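As an illustration of the kind of conservative safety mechanism described above, the following minimal sketch accepts a neural correction only if it is neither much shorter than the input (a possible catastrophic deletion) nor too dissimilar from it (a possible over-correction), and otherwise falls back to the source sentence before applying rules. The function names `is_safe` and `guarded_correct`, the `apply_rules` callback, and both thresholds are hypothetical choices made for this example; they are not taken from the system itself.

```python
from difflib import SequenceMatcher
from typing import Callable


def is_safe(source: str, candidate: str,
            min_len_ratio: float = 0.8,      # hypothetical threshold
            min_similarity: float = 0.6) -> bool:  # hypothetical threshold
    """Return True if `candidate` looks like a conservative edit of `source`."""
    if not candidate.strip():
        return False  # empty output is treated as a catastrophic deletion
    if len(candidate) < min_len_ratio * len(source):
        return False  # too much text was removed
    # Character-level similarity in [0, 1]; low values suggest over-correction.
    similarity = SequenceMatcher(None, source, candidate).ratio()
    return similarity >= min_similarity


def guarded_correct(source: str, neural_output: str,
                    apply_rules: Callable[[str], str]) -> str:
    """Combine a neural hypothesis with rule-based fixes under safety checks."""
    # Keep the neural hypothesis only if it passes the safety check.
    base = neural_output if is_safe(source, neural_output) else source
    # Apply symbolic rules on top, and guard their output the same way.
    ruled = apply_rules(base)
    return ruled if is_safe(source, ruled) else base


if __name__ == "__main__":
    # Toy rule (an assumption for the demo): collapse double spaces.
    rules = lambda s: s.replace("  ", " ")
    print(guarded_correct("she go to school every day",
                          "she goes to school every day", rules))
```

In this sketch the fallback order (neural output, then source) encodes the conservative bias: when in doubt, the input is left unchanged rather than risk a harmful edit.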