Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation

Roman Grundkiewicz, Marcin Junczys-Dowmunt


Abstract
We combine two of the most popular approaches to automated Grammatical Error Correction (GEC): GEC based on Statistical Machine Translation (SMT) and GEC based on Neural Machine Translation (NMT). The hybrid system achieves new state-of-the-art results on the CoNLL-2014 and JFLEG benchmarks. This GEC system preserves the accuracy of SMT output and, at the same time, generates more fluent sentences, as is typical for NMT. Our analysis shows that the created systems are closer to reaching human-level performance than any other GEC system reported so far.
Anthology ID:
N18-2046
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marilyn Walker, Heng Ji, Amanda Stent
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
284–290
URL:
https://aclanthology.org/N18-2046
DOI:
10.18653/v1/N18-2046
Cite (ACL):
Roman Grundkiewicz and Marcin Junczys-Dowmunt. 2018. Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 284–290, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation (Grundkiewicz & Junczys-Dowmunt, NAACL 2018)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/N18-2046.pdf
Video:
 https://preview.aclanthology.org/emnlp-22-attachments/N18-2046.mp4
Data
CoNLL, CoNLL-2014 Shared Task: Grammatical Error Correction, JFLEG