A Process-oriented Dataset of Revisions during Writing

Rianne Conijn, Emily Dux Speltz, Menno van Zaanen, Luuk Van Waes, Evgeny Chukharev-Hudilainen


Abstract
Revision plays a major role in writing and in the analysis of writing processes. Revisions can be analyzed using a product-oriented approach (focusing on the finished product, the text that has been produced) or a process-oriented approach (focusing on the process that the writer followed to generate this product). Although several language resources exist for the product-oriented approach to revisions, hardly any resources are available for an in-depth analysis of the revision process. Therefore, we provide an extensive dataset of revisions made during writing (accessible via https://hdl.handle.net/10411/VBDYGX). This dataset is based on keystroke and eye-tracking data from 65 students with a variety of backgrounds (undergraduate and graduate students with English as a first or second language) performing a variety of tasks (argumentative text and academic abstract). In total, 7,120 revisions were identified in the dataset. For each revision, 18 features were manually annotated and 31 features were automatically extracted. As a case study, we show two potential use cases of the dataset. In addition, future uses of the dataset are described.
Anthology ID:
2020.lrec-1.45
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
363–368
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.45
Cite (ACL):
Rianne Conijn, Emily Dux Speltz, Menno van Zaanen, Luuk Van Waes, and Evgeny Chukharev-Hudilainen. 2020. A Process-oriented Dataset of Revisions during Writing. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 363–368, Marseille, France. European Language Resources Association.
Cite (Informal):
A Process-oriented Dataset of Revisions during Writing (Conijn et al., LREC 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.45.pdf