Zhaohong Wan

2020

Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation
Zhaohong Wan | Xiaojun Wan | Wenguang Wang
Proceedings of the 28th International Conference on Computational Linguistics

The incorporation of data augmentation methods into the grammatical error correction (GEC) task has attracted much attention. However, existing data augmentation methods mainly apply noise to tokens, which limits the diversity of the generated errors. In view of this, we propose a new data augmentation method that applies noise to the latent representation of a sentence. By editing the latent representations of grammatical sentences, we can generate synthetic samples with various error types. Combined with some pre-defined rules, our method can greatly improve the performance and robustness of existing grammatical error correction models. We evaluate our method on public GEC benchmarks, and it achieves state-of-the-art performance on the CoNLL-2014 and FCE benchmarks.

Co-authors

Xiaojun Wan 1
Wenguang Wang 1

Venues

coling 1