Grammatical Error Correction via Mixed-Grained Weighted Training
Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, Yongdong Zhang
Abstract
Grammatical Error Correction (GEC) aims to automatically correct grammatical errors in natural text. Almost all previous work treats annotated training data equally and neglects inherent discrepancies in the data. In this paper, we consider two aspects of these discrepancies: the accuracy of data annotation and the diversity of potential annotations. To this end, we propose MainGEC, which designs token-level and sentence-level training weights based on these discrepancies and then conducts mixed-grained weighted training to improve the training effect for GEC. Empirical evaluation shows that, in both the Seq2Seq and Seq2Edit settings, MainGEC achieves consistent and significant performance improvements on two benchmark datasets, demonstrating the effectiveness and superiority of mixed-grained weighted training. Further ablation experiments verify the effectiveness of the designed weights at both granularities in MainGEC.
- Anthology ID:
- 2023.findings-emnlp.400
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6027–6037
- URL:
- https://aclanthology.org/2023.findings-emnlp.400
- DOI:
- 10.18653/v1/2023.findings-emnlp.400
- Cite (ACL):
- Jiahao Li, Quan Wang, Chiwei Zhu, Zhendong Mao, and Yongdong Zhang. 2023. Grammatical Error Correction via Mixed-Grained Weighted Training. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 6027–6037, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Grammatical Error Correction via Mixed-Grained Weighted Training (Li et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.findings-emnlp.400.pdf
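
As an illustration of the mixed-grained weighted training described in the abstract, the sketch below scales each target token's negative log-likelihood by a token-level weight and each sentence's summed loss by a sentence-level weight. It is a minimal sketch, not the MainGEC implementation: the function name `weighted_nll`, the averaging over sentences, and all numeric values are assumptions made for illustration. The abstract does not say how the weights are computed, so they are simply passed in.

```python
import math
from typing import Sequence


def weighted_nll(
    token_probs: Sequence[Sequence[float]],
    token_weights: Sequence[Sequence[float]],
    sentence_weights: Sequence[float],
) -> float:
    """Token- and sentence-weighted negative log-likelihood, averaged over sentences.

    token_probs[i][j]   : model probability of the j-th target token of sentence i
    token_weights[i][j] : hypothetical token-level weight for that token
    sentence_weights[i] : hypothetical sentence-level weight for sentence i
    """
    if not (len(token_probs) == len(token_weights) == len(sentence_weights)):
        raise ValueError("all inputs must cover the same number of sentences")
    if not token_probs:
        raise ValueError("at least one sentence is required")

    total = 0.0
    for probs, weights, sent_w in zip(token_probs, token_weights, sentence_weights):
        if len(probs) != len(weights):
            raise ValueError("each sentence needs one weight per target token")
        # Token-level granularity: scale each token's loss individually.
        sent_loss = sum(-w * math.log(p) for p, w in zip(probs, weights))
        # Sentence-level granularity: scale the whole sentence's loss.
        total += sent_w * sent_loss
    return total / len(token_probs)


# Made-up probabilities and weights, for illustration only.
probs = [[0.9, 0.8], [0.5]]
uniform = weighted_nll(probs, [[1.0, 1.0], [1.0]], [1.0, 1.0])
reweighted = weighted_nll(probs, [[1.0, 0.5], [1.0]], [1.0, 0.2])
```

With all weights set to 1.0 the function reduces to the usual unweighted loss, which is the equal treatment of training data that the abstract argues against. Lowering a token or sentence weight reduces that item's contribution to the training signal.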