COCOGEC: Counterfactual Generation for Robust Grammatical Error Correction
Qianyu Wang, Xiaoman Wang, Yuanyuan Liang, Xinyuan Li, Yunshi Lan
Abstract
Grammatical error correction (GEC) systems are usually trained and evaluated on GEC benchmarks, but their performance often drops sharply once the surrounding context is slightly perturbed or extended. This indicates that existing GEC models often fail to recognize error patterns across varying contexts. In this paper, we thoroughly investigate counterfactuals for GEC, where subtle changes to the context can flip the label. We address this robustness gap by viewing contextual variation through the lens of counterfactual data. We propose CoCoGEC, a counterfactual generation framework that creates copies of training instances in which error-irrelevant contexts are altered. The framework systematically generates counterfactuals by (1) generating intra- and inter-sentence counterfactuals that preserve the error patterns and syntax of the original instances while altering the word-level and sentence-level contexts; and (2) revising the generated counterfactuals by selecting instances with flipped labels and a high GEC Mutual Information (MI) coefficient. Extensive experiments show that our method substantially improves the stability of GEC models, outperforming a set of data augmentation baselines. In particular, it achieves absolute F0.5 gains of +9.9, +11.3, and +20.8 points on the perturbed BEA-19*, CoNLL-14*, and TEM-8* datasets, respectively. Our code is released at https://github.com/Quinnok/CoCoGEC.
- Anthology ID:
- 2026.findings-acl.195
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4004–4019
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.195/
- Cite (ACL):
- Qianyu Wang, Xiaoman Wang, Yuanyuan Liang, Xinyuan Li, and Yunshi Lan. 2026. COCOGEC: Counterfactual Generation for Robust Grammatical Error Correction. In Findings of the Association for Computational Linguistics: ACL 2026, pages 4004–4019, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- COCOGEC: Counterfactual Generation for Robust Grammatical Error Correction (Wang et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.195.pdf