Abstract
Until now, grammatical error correction (GEC) has been primarily evaluated on text written by non-native English speakers, with a focus on student essays. This paper enables GEC development on text written by native speakers by providing a new data set and metric. We present a multiple-reference test corpus for GEC that includes 4,000 sentences in two new domains (formal and informal writing by native English speakers) and 2,000 sentences from a diverse set of non-native student writing. We also collect human judgments of several GEC systems on this new test set and perform a meta-evaluation, assessing how reliable automatic metrics are across these domains. We find that commonly used GEC metrics have inconsistent performance across domains, and therefore we propose a new ensemble metric that is robust on all three domains of text.
 - Anthology ID:
 - Q19-1032
 - Volume:
 - Transactions of the Association for Computational Linguistics, Volume 7
 - Year:
 - 2019
 - Address:
 - Cambridge, MA
 - Editors:
 - Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
 - Venue:
 - TACL
 - Publisher:
 - MIT Press
 - Pages:
 - 551–566
 - URL:
 - https://aclanthology.org/Q19-1032
 - DOI:
 - 10.1162/tacl_a_00282
 - Cite (ACL):
 - Courtney Napoles, Maria Nădejde, and Joel Tetreault. 2019. Enabling Robust Grammatical Error Correction in New Domains: Data Sets, Metrics, and Analyses. Transactions of the Association for Computational Linguistics, 7:551–566.
 - Cite (Informal):
 - Enabling Robust Grammatical Error Correction in New Domains: Data Sets, Metrics, and Analyses (Napoles et al., TACL 2019)
 - PDF:
 - https://preview.aclanthology.org/ingest-acl-2023-videos/Q19-1032.pdf
 - Data
 - GMEG-wiki, GMEG-yahoo