Abstract
We introduce a novel position offset label prediction subtask to the encoder-decoder architecture for the grammatical error correction (GEC) task. To keep the meaning of the input sentence unchanged, only a few words should be inserted or deleted during correction, and most tokens in the erroneous sentence appear in the paired correct sentence with limited position movement. Inspired by this observation, we design an auxiliary task to predict the position offset label (POL) of each token, which naturally integrates different correction editing operations into a unified framework. Based on the predicted POL, we further propose a new copy mechanism (P-copy) to replace the vanilla copy module. Experimental results on Chinese, English and Japanese datasets demonstrate that our proposed POL-Pc framework clearly improves the performance of baseline models. Moreover, our model yields consistent performance gains across various data augmentation methods. In particular, after incorporating synthetic data, our model achieves an F-0.5 score of 38.95 on the Chinese GEC dataset, outperforming the previous state of the art by a wide margin of 1.98 points.
- Anthology ID:
- 2022.coling-1.480
- Volume:
- Proceedings of the 29th International Conference on Computational Linguistics
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 5409–5418
- URL:
- https://aclanthology.org/2022.coling-1.480
- Cite (ACL):
- Xiuyu Wu, Jingsong Yu, Xu Sun, and Yunfang Wu. 2022. Position Offset Label Prediction for Grammatical Error Correction. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5409–5418, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal):
- Position Offset Label Prediction for Grammatical Error Correction (Wu et al., COLING 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2022.coling-1.480.pdf
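
Illustrative sketch (not taken from the paper): the abstract observes that most tokens of the erroneous sentence reappear in the corrected sentence with limited position movement. One way to picture a position offset label is as the difference between a token's index in the corrected sentence and its index in the erroneous one. The code below is a minimal sketch under that assumption. The function name `position_offset_labels`, the `"DEL"` marker for tokens with no counterpart, and the use of Python's `difflib` for alignment are all hypothetical choices made here for illustration; the abstract does not specify the paper's actual labeling scheme.

```python
from difflib import SequenceMatcher


def position_offset_labels(source, target, missing="DEL"):
    """Label each source token with (index in target) - (index in source).

    Assumptions of this sketch, not details confirmed by the paper:
    - tokens are aligned with difflib's longest-matching-block heuristic;
    - a source token with no aligned target token gets the `missing` marker.
    """
    labels = [missing] * len(source)
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Every token inside one matching block shares the same offset.
        offset = block.b - block.a
        for k in range(block.size):
            labels[block.a + k] = offset
    return labels


if __name__ == "__main__":
    erroneous = "She go to the the school".split()
    corrected = "She goes to the school".split()
    print(position_offset_labels(erroneous, corrected))
    # [0, 'DEL', 0, 0, 'DEL', -1]
```

In this toy example the unchanged tokens keep offset 0, the token after the deleted duplicate shifts by -1, and tokens that are replaced or removed receive the hypothetical `"DEL"` marker. Inserted target tokens such as "goes" have no source position, so a labeling of source tokens alone does not cover them.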