Abstract
Since the end of the CoNLL-2014 shared task on grammatical error correction (GEC), research into language model (LM) based approaches to GEC has largely stagnated. In this paper, we re-examine LMs in GEC and show that it is entirely possible to build a simple system that not only requires minimal annotated data (∼1000 sentences), but is also fairly competitive with several state-of-the-art systems. This approach should be of particular interest for languages where very little annotated training data exists, although we also hope to use it as a baseline to motivate future research.

- Anthology ID: W18-0529
- Volume: Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
- Month: June
- Year: 2018
- Address: New Orleans, Louisiana
- Editors: Joel Tetreault, Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Helen Yannakoudakis
- Venue: BEA
- SIG: SIGEDU
- Publisher: Association for Computational Linguistics
- Pages: 247–253
- URL: https://aclanthology.org/W18-0529
- DOI: 10.18653/v1/W18-0529
- Cite (ACL): Christopher Bryant and Ted Briscoe. 2018. Language Model Based Grammatical Error Correction without Annotated Training Data. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 247–253, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal): Language Model Based Grammatical Error Correction without Annotated Training Data (Bryant & Briscoe, BEA 2018)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/W18-0529.pdf
- Data: CoNLL-2014 Shared Task: Grammatical Error Correction, FCE, JFLEG
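The abstract's core idea — correcting errors with an LM rather than annotated training data — can be sketched with a toy example: generate candidate rewrites from small confusion sets and keep the candidate the language model scores highest. Everything below (the three-sentence corpus, the add-alpha bigram LM, and the article confusion set) is an illustrative assumption, not the paper's actual implementation, which uses a far larger LM and richer candidate generation.

```python
import math
from itertools import product

# Toy corpus of "correct" sentences used to estimate the LM.
# (Assumption for illustration only; the paper uses a large pretrained LM.)
CORPUS = [
    "she ate an apple",
    "he ate a banana",
    "they ate an orange",
]

def train_bigram_lm(sentences):
    """Count bigram and history (unigram) frequencies."""
    bigrams, unigrams = {}, {}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            bigrams[(a, b)] = bigrams.get((a, b), 0) + 1
            unigrams[a] = unigrams.get(a, 0) + 1
    return bigrams, unigrams

def log_prob(sentence, bigrams, unigrams, alpha=0.1):
    """Add-alpha smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams) + 1
    return sum(
        math.log((bigrams.get((a, b), 0) + alpha)
                 / (unigrams.get(a, 0) + alpha * vocab))
        for a, b in zip(tokens, tokens[1:])
    )

# Confusion sets: tokens that may substitute for one another.
# A real system would also cover prepositions, morphology, spelling, etc.
CONFUSION = {"a": ("a", "an"), "an": ("a", "an")}

def correct(sentence, bigrams, unigrams):
    """Enumerate all candidate rewrites; return the LM's highest-scoring one."""
    options = [CONFUSION.get(tok, (tok,)) for tok in sentence.split()]
    candidates = (" ".join(c) for c in product(*options))
    return max(candidates, key=lambda c: log_prob(c, bigrams, unigrams))

bigrams, unigrams = train_bigram_lm(CORPUS)
print(correct("she ate a apple", bigrams, unigrams))  # -> she ate an apple
```

The appeal for low-resource languages follows directly from this design: nothing here requires parallel error-corrected data, only plain text to estimate the LM and a way to propose candidate edits.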