Edit-Aware Reward Modeling for Chinese Grammatical Error Correction

Yilin Li, Xiaojun Wan


Abstract
While large language models have achieved remarkable success in various natural language processing tasks, their potential in grammatical error correction remains underexplored. Recent work has applied reinforcement learning with rule-based rewards to Chinese grammatical error correction (CGEC), but these approaches rely on coarse-grained binary signals (exact match or not) that fail to capture fine-grained quality distinctions among correction candidates. In this paper, we propose the Edit-Aware Reward Model (EARM), a novel reward modeling framework that explicitly incorporates edit-awareness into preference learning for CGEC. EARM introduces a dual-granularity training objective that jointly optimizes a sentence-level Bradley-Terry ranking loss and a token-level weighted Bradley-Terry ranking loss, in which edit tokens receive higher importance weights. When integrated with GRPO, our approach achieves 61.29/63.08 on FCGEC/NaCGEC with a single output and 65.04/64.59 with best-of-16 reranking, surpassing the previous best by 5.41 and 1.80 points, respectively. Extensive experiments demonstrate that learned edit-aware rewards significantly outperform rule-based alternatives for CGEC preference optimization.
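The dual-granularity objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the token-level score is taken to be a weighted mean of per-token rewards, edit tokens get a hypothetical weight `edit_weight`, and the two terms are combined with a hypothetical mixing coefficient `alpha`. The function names are likewise invented here and are not the authors' code.

```python
import math
from typing import Sequence


def bt_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = r_chosen - r_rejected
    # Numerically stable softplus(-margin), equal to log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def weighted_score(token_rewards: Sequence[float],
                   edit_mask: Sequence[bool],
                   edit_weight: float = 2.0) -> float:
    """Weighted mean of per-token rewards (assumed form, not from the paper).

    Tokens flagged in edit_mask get weight edit_weight; all others get 1.0.
    """
    if len(token_rewards) != len(edit_mask) or not token_rewards:
        raise ValueError("token_rewards and edit_mask must be non-empty and equal length")
    weights = [edit_weight if is_edit else 1.0 for is_edit in edit_mask]
    return sum(w * r for w, r in zip(weights, token_rewards)) / sum(weights)


def dual_granularity_loss(sent_chosen: float, sent_rejected: float,
                          tok_chosen: Sequence[float], mask_chosen: Sequence[bool],
                          tok_rejected: Sequence[float], mask_rejected: Sequence[bool],
                          edit_weight: float = 2.0, alpha: float = 0.5) -> float:
    """Sentence-level BT loss plus alpha times token-level weighted BT loss."""
    sentence_term = bt_loss(sent_chosen, sent_rejected)
    token_term = bt_loss(
        weighted_score(tok_chosen, mask_chosen, edit_weight),
        weighted_score(tok_rejected, mask_rejected, edit_weight),
    )
    return sentence_term + alpha * token_term


if __name__ == "__main__":
    # Toy pair: the chosen correction scores higher on its single edited token.
    loss = dual_granularity_loss(
        sent_chosen=1.2, sent_rejected=0.3,
        tok_chosen=[0.1, 0.9, 0.2], mask_chosen=[False, True, False],
        tok_rejected=[0.1, -0.4, 0.2], mask_rejected=[False, True, False],
    )
    print(f"dual-granularity loss: {loss:.4f}")
```

Under these assumptions, raising `edit_weight` makes the token-level term more sensitive to reward differences on edited positions, which is the intuition the abstract gives for weighting edit tokens more heavily. In practice the scalar rewards would come from a learned reward model rather than being passed in by hand.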
Anthology ID:
2026.acl-long.1900
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
40945–40957
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1900/
Cite (ACL):
Yilin Li and Xiaojun Wan. 2026. Edit-Aware Reward Modeling for Chinese Grammatical Error Correction. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 40945–40957, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Edit-Aware Reward Modeling for Chinese Grammatical Error Correction (Li & Wan, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1900.pdf
Checklist:
 2026.acl-long.1900.checklist.pdf