A Closer Look at k-Nearest Neighbors Grammatical Error Correction

Justin Vasselli, Taro Watanabe


Abstract
In various natural language processing tasks, such as named entity recognition and machine translation, example-based approaches have been used to improve performance by leveraging existing knowledge. However, the effectiveness of such approaches for Grammatical Error Correction (GEC) is unclear. In this work, we explore how an example-based approach affects the accuracy and interpretability of the output of GEC systems and the trade-offs involved. The approach we investigate has shown great promise in machine translation by using the $k$-nearest translation examples to improve the results of a pretrained Transformer model. We find that using this technique increases precision by reducing the number of false positives, but recall suffers as the model becomes more conservative overall. Increasing the number of example sentences in the datastore does lead to better-performing systems, but with diminishing returns and a high decoding cost. Synthetic data can be used as examples, but the effectiveness varies depending on the base model. Finally, we find that finetuning on a set of data may be more effective than using that data during decoding as examples.
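The abstract does not spell out the retrieval mechanism. As a rough illustration only, the sketch below shows how $k$-nearest-neighbor decoding is commonly set up in the machine translation work the abstract alludes to: at each decoding step, the $k$ closest datastore entries are turned into a token distribution, which is then interpolated with the base model's distribution. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy two-dimensional keys standing in for decoder hidden states, the squared-L2 distance, the softmax temperature, and the interpolation weight `lam`.

```python
import math
from collections import defaultdict


def knn_distribution(query, datastore, k, temperature):
    """Build a token distribution from the k nearest datastore entries.

    datastore is a list of (key_vector, target_token) pairs; in a real system
    the keys would be decoder hidden states (an assumption in this sketch).
    """
    # Squared L2 distance from the query to every key, keeping the k closest.
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Closer neighbors get exponentially more weight; neighbors that share
    # a target token pool their weight.
    weights = defaultdict(float)
    for dist, token in scored:
        weights[token] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_model, p_knn, lam):
    """Mix the base model and retrieval distributions with weight lam."""
    vocab = set(p_model) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_model.get(t, 0.0)
        for t in vocab
    }


# Toy example: the base model prefers "have", the retrieved examples prefer "has".
datastore = [
    ((0.0, 0.0), "has"),
    ((0.1, 0.0), "has"),
    ((1.0, 1.0), "have"),
    ((5.0, 5.0), "had"),
]
p_model = {"have": 0.6, "has": 0.3, "had": 0.1}

p_knn = knn_distribution(query=(0.0, 0.1), datastore=datastore, k=3, temperature=1.0)
p_final = interpolate(p_model, p_knn, lam=0.5)

print(max(p_model, key=p_model.get), "->", max(p_final, key=p_final.get))
```

In this toy setup the retrieved examples shift the top prediction away from the base model's choice. The linear scan over the datastore also hints at why the abstract reports a high decoding cost as the datastore grows; practical systems typically replace it with an approximate nearest-neighbor index.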
Anthology ID:
2023.bea-1.19
Volume:
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
220–231
URL:
https://aclanthology.org/2023.bea-1.19
DOI:
10.18653/v1/2023.bea-1.19
Cite (ACL):
Justin Vasselli and Taro Watanabe. 2023. A Closer Look at k-Nearest Neighbors Grammatical Error Correction. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 220–231, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Closer Look at k-Nearest Neighbors Grammatical Error Correction (Vasselli & Watanabe, BEA 2023)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2023.bea-1.19.pdf
Video:
https://preview.aclanthology.org/dois-2013-emnlp/2023.bea-1.19.mp4