Generation of Synthetic Error Data of Verb Order Errors for Swedish

Judit Casademont Moner, Elena Volodina


Abstract
We report on work in progress on generating a synthetic error dataset for Swedish by replicating errors observed in an authentic error-annotated dataset. We analyze a small subset of authentic errors, capture regular patterns based on parts of speech, and design a set of rules to corrupt new data. We explore the approach and identify its capabilities, advantages, and limitations as a way to enrich the existing collection of error-annotated data. This work focuses on word order errors, specifically those involving the placement of finite verbs in a sentence.
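To make the idea of a part-of-speech-based corruption rule concrete, here is a minimal sketch of what one such rule might look like. It is an illustration only and not the authors' rule set: the tag names (`ADV`, `VERB_FIN`, `PRON`, `NOUN`), the function `corrupt_v2`, and the example sentence are all assumptions made for this sketch. It relies on the fact that Swedish main clauses place the finite verb in second position, and it undoes the verb-subject inversion after a fronted adverb.

```python
from typing import List, Optional, Tuple

# (word form, POS tag). The tag names are hypothetical, chosen for this sketch.
Token = Tuple[str, str]


def corrupt_v2(sentence: List[Token]) -> Optional[List[Token]]:
    """Swap a finite verb with the subject that follows it.

    Matches the tag pattern ADV, VERB_FIN, PRON/NOUN and returns a corrupted
    copy of the sentence, or None if the pattern does not occur.
    """
    for i in range(1, len(sentence) - 1):
        prev_tag = sentence[i - 1][1]
        tag = sentence[i][1]
        next_tag = sentence[i + 1][1]
        if prev_tag == "ADV" and tag == "VERB_FIN" and next_tag in {"PRON", "NOUN"}:
            corrupted = list(sentence)  # leave the input untouched
            corrupted[i], corrupted[i + 1] = corrupted[i + 1], corrupted[i]
            return corrupted
    return None


# "Idag går jag till skolan" (Today I go to school), with assumed tags.
correct = [
    ("Idag", "ADV"),
    ("går", "VERB_FIN"),
    ("jag", "PRON"),
    ("till", "ADP"),
    ("skolan", "NOUN"),
]

corrupted = corrupt_v2(correct)
if corrupted is not None:
    print(" ".join(word for word, _ in corrupted))  # Idag jag går till skolan
```

Returning `None` when the pattern is absent lets a caller skip sentences that a rule cannot corrupt, so every generated error can be paired with its correct source sentence.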
Anthology ID:
2022.bea-1.6
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
33–38
URL:
https://aclanthology.org/2022.bea-1.6
DOI:
10.18653/v1/2022.bea-1.6
Cite (ACL):
Judit Casademont Moner and Elena Volodina. 2022. Generation of Synthetic Error Data of Verb Order Errors for Swedish. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 33–38, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Generation of Synthetic Error Data of Verb Order Errors for Swedish (Casademont Moner & Volodina, BEA 2022)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.bea-1.6.pdf
Video:
https://preview.aclanthology.org/auto-file-uploads/2022.bea-1.6.mp4