Frustratingly Easy System Combination for Grammatical Error Correction

Muhammad Qorib, Seung-Hoon Na, Hwee Tou Ng


Abstract
In this paper, we formulate system combination for grammatical error correction (GEC) as a simple machine learning task: binary classification. We demonstrate that, with the right problem formulation, a simple logistic regression algorithm can be highly effective for combining GEC models. Our method increases the F0.5 score over the best base GEC system by 4.2 points on the CoNLL-2014 test set and by 7.2 points on the BEA-2019 test set. Furthermore, our method outperforms the state of the art by 4.0 points on the BEA-2019 test set, 1.2 points on the CoNLL-2014 test set with the original annotation, and 3.4 points on the CoNLL-2014 test set with the alternative annotation. We also show that our system combination generates better corrections, with higher F0.5 scores, than a conventional ensemble.
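The formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature design (one binary indicator per base system that proposed an edit), the three hypothetical systems, the toy training data, and all function names are assumptions made for illustration only. Each candidate edit is treated as one binary classification instance (keep or discard), and a logistic regression model is trained with plain gradient descent. A helper for the F0.5 measure is included, using the standard F-beta definition.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=1.0, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, dim = len(rows), len(rows[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def keep_probability(votes, weights, bias):
    """Probability that a candidate edit should be kept."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, votes)))


def select_edits(candidates, weights, bias, threshold=0.5):
    """Keep the edits whose predicted probability exceeds the threshold."""
    return [edit for edit, votes in candidates
            if keep_probability(votes, weights, bias) > threshold]


def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta=0.5 weights precision above recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Toy data (assumed): each row says which of three hypothetical base
# systems proposed an edit; the label says whether the edit was correct.
TRAIN_X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 0, 0],
           [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 0, 1]]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 1, 0]

WEIGHTS, BIAS = train_logistic_regression(TRAIN_X, TRAIN_Y)

# Candidate edits for a new sentence, with the systems that proposed them.
CANDIDATES = [("a -> an", [1, 1, 0]), ("go -> goes", [0, 0, 1])]
print(select_edits(CANDIDATES, WEIGHTS, BIAS))
```

In this toy setting the classifier learns something close to a weighted vote over systems; a richer feature set or a different decision threshold would change which edits survive.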
Anthology ID:
2022.naacl-main.143
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1964–1974
URL:
https://aclanthology.org/2022.naacl-main.143
DOI:
10.18653/v1/2022.naacl-main.143
Cite (ACL):
Muhammad Qorib, Seung-Hoon Na, and Hwee Tou Ng. 2022. Frustratingly Easy System Combination for Grammatical Error Correction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1964–1974, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Frustratingly Easy System Combination for Grammatical Error Correction (Qorib et al., NAACL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.naacl-main.143.pdf
Software:
2022.naacl-main.143.software.zip
Video:
https://preview.aclanthology.org/ingestion-script-update/2022.naacl-main.143.mp4
Code:
nusnlp/esc
Data:
WI-LOCNESS