Grammar Error Correction in Morphologically Rich Languages: The Case of Russian

Alla Rozovskaya, Dan Roth


Abstract
Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with a focus on Russian. We present a corrected and error-tagged corpus of Russian learner writing and develop models that make use of existing state-of-the-art methods that have been well studied for English. Although impressive results have recently been achieved for grammar error correction of non-native English writing, these results are limited to domains where plentiful training data are available. Because annotation is extremely costly, these approaches are not suitable for the majority of domains and languages. We thus focus on methods that use “minimal supervision”; that is, those that do not rely on large amounts of annotated training data, and show how existing minimal-supervision approaches extend to a highly inflectional language such as Russian. The results demonstrate that these methods are particularly useful for correcting mistakes in grammatical phenomena that involve rich morphology.
Anthology ID:
Q19-1001
Volume:
Transactions of the Association for Computational Linguistics, Volume 7
Year:
2019
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
1–17
URL:
https://aclanthology.org/Q19-1001
DOI:
10.1162/tacl_a_00251
Cite (ACL):
Alla Rozovskaya and Dan Roth. 2019. Grammar Error Correction in Morphologically Rich Languages: The Case of Russian. Transactions of the Association for Computational Linguistics, 7:1–17.
Cite (Informal):
Grammar Error Correction in Morphologically Rich Languages: The Case of Russian (Rozovskaya & Roth, TACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/Q19-1001.pdf
Data:
FCE, JFLEG