Ben Naismith


2025

pdf bib
Identifying and analyzing ‘noisy’ spelling errors in a second language corpus
Alan Juffs | Ben Naismith
Proceedings of the Tenth Workshop on Noisy and User-generated Text

This paper addresses the problem of identifying and analyzing ‘noisy’ spelling errors in texts written by second language (L2) learners’ texts in a written corpus. Using Python, spelling errors were identified in 5774 texts greater than or equal to 66 words (total=1,814,209 words), selected from a corpus of 4.2 million words (Authors-1). The statistical analysis used hurdle() models in R, which are appropriate for non-normal, count data, with many zeros.

2023

pdf bib
Automated evaluation of written discourse coherence using GPT-4
Ben Naismith | Phoebe Mulcaire | Jill Burstein
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

The popularization of large language models (LLMs) such as OpenAI’s GPT-3 and GPT-4 have led to numerous innovations in the field of AI in education. With respect to automated writing evaluation (AWE), LLMs have reduced challenges associated with assessing writing quality characteristics that are difficult to identify automatically, such as discourse coherence. In addition, LLMs can provide rationales for their evaluations (ratings) which increases score interpretability and transparency. This paper investigates one approach to producing ratings by training GPT-4 to assess discourse coherence in a manner consistent with expert human raters. The findings of the study suggest that GPT-4 has strong potential to produce discourse coherence ratings that are comparable to human ratings, accompanied by clear rationales. Furthermore, the GPT-4 ratings outperform traditional NLP coherence metrics with respect to agreement with human ratings. These results have implications for advancing AWE technology for learning and assessment.