Abstract
We present a crowdsourced collection of error annotations for transcriptions of spoken learner English. Our emphasis in data collection is on fluency corrections, a more complete form of correction than has traditionally been aimed for in grammatical error correction (GEC) research. Fluency corrections require improvements to the text that take discourse- and utterance-level semantics into account: the result is a more naturalistic, holistic version of the original. We propose that this shifted emphasis be reflected in a new name for the task: ‘holistic error correction’ (HEC). We analyse crowdworker behaviour in HEC and conclude that the method is useful, with certain amendments, for future work.
- Anthology ID:
- W17-5010
- Volume:
- Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Venue:
- BEA
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 91–100
- URL:
- https://aclanthology.org/W17-5010
- DOI:
- 10.18653/v1/W17-5010
- Cite (ACL):
- Andrew Caines, Emma Flint, and Paula Buttery. 2017. Collecting fluency corrections for spoken learner English. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 91–100, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Collecting fluency corrections for spoken learner English (Caines et al., BEA 2017)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W17-5010.pdf
- Data
- CoNLL-2014 Shared Task: Grammatical Error Correction, FCE