2024
Using Large Language Models to Assess Young Students’ Writing Revisions
Tianwen Li | Zhexiong Liu | Lindsay Matsumura | Elaine Wang | Diane Litman | Richard Correnti
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
Although effective revision is a crucial component of writing instruction, few automated writing evaluation (AWE) systems specifically focus on the quality of the revisions students undertake. In this study, we investigate the use of a large language model (GPT-4) with Chain-of-Thought (CoT) prompting for assessing the quality of young students’ essay revisions aligned with the automated feedback messages they received. Results indicate that GPT-4 has significant potential for evaluating revision quality, particularly when prompts include detailed rubrics describing common revision patterns shown by young writers. However, the addition of CoT prompting did not significantly improve performance. Further examination of GPT-4’s scoring performance across levels of student writing proficiency revealed variable agreement with human ratings. We discuss the implications for improving AWE systems that focus on young students.
2023
Predicting the Quality of Revisions in Argumentative Writing
Zhexiong Liu | Diane Litman | Elaine Wang | Lindsay Matsumura | Richard Correnti
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
The ability to revise in response to feedback is critical to students’ writing success. In the specific case of argument writing, identifying whether an argument revision (AR) is successful is a complex problem because AR quality depends on the overall content of the argument. For example, adding the same evidence sentence could strengthen or weaken existing claims in different argument contexts (ACs). To address this issue, we developed Chain-of-Thought prompts that elicit ChatGPT-generated ACs for AR quality prediction. Experiments on two corpora, our annotated elementary essays and an existing benchmark of college essays, demonstrate the superiority of the proposed ACs over baselines.
2020
Annotation and Classification of Evidence and Reasoning Revisions in Argumentative Writing
Tazin Afrin | Elaine Lin Wang | Diane Litman | Lindsay Clare Matsumura | Richard Correnti
Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications
Automated writing evaluation systems can improve students’ writing insofar as students attend to the feedback provided and revise their essay drafts in ways aligned with that feedback. Existing research on revision of argumentative writing in such systems, however, has focused on the types of revisions students make (e.g., surface vs. content) rather than the extent to which revisions actually respond to the feedback provided and improve the essay. We introduce an annotation scheme to capture the nature of sentence-level revisions of evidence use and reasoning (the ‘RER’ scheme) and apply it to 5th- and 6th-grade students’ argumentative essays. We show that reliable manual annotation can be achieved and that revision annotations correlate with a holistic assessment of essay improvement in line with the feedback provided. Furthermore, we explore the feasibility of automatically classifying revisions according to our scheme.
2015
Incorporating Coherence of Topics as a Criterion in Automatic Response-to-Text Assessment of the Organization of Writing
Zahra Rahimi | Diane Litman | Elaine Wang | Richard Correnti
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications