Lucy Skidmore


2025

pdf bib
Transformer Architectures for Vocabulary Test Item Difficulty Prediction
Lucy Skidmore | Mariano Felice | Karen Dunn
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

Establishing the difficulty of test items is an essential part of the language assessment development process. However, traditional item calibration methods are often time-consuming and difficult to scale. To address this, recent research has explored natural language processing (NLP) approaches for automatically predicting item difficulty from text. This paper investigates the use of transformer models to predict the difficulty of second language (L2) English vocabulary test items that have multilingual prompts. We introduce an extended version of the British Council’s Knowledge-based Vocabulary Lists (KVL) dataset, containing 6,768 English words paired with difficulty scores and question prompts written in Spanish, German, and Mandarin Chinese. Using this new dataset for fine-tuning, we explore various transformer-based architectures. Our findings show that a multilingual model jointly trained on all L1 subsets of the KVL achieve the best results, with analysis suggesting that the model is able to learn global patterns of cross-linguistic influence on target word difficulty. This study establishes a foundation for NLP-based item difficulty estimation using the KVL dataset, providing actionable insights for developing multilingual test items.

2022

pdf bib
Incremental Disfluency Detection for Spoken Learner English
Lucy Skidmore | Roger Moore
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

Incremental disfluency detection provides a framework for computing communicative meaning from hesitations, repetitions and false starts commonly found in speech. One application of this area of research is in dialogue-based computer-assisted language learning (CALL), where detecting learners’ production issues word-by-word can facilitate timely and pedagogically driven responses from an automated system. Existing research on disfluency detection in learner speech focuses on disfluency removal for subsequent downstream tasks, processing whole utterances non-incrementally. This paper instead explores the application of laughter as a feature for incremental disfluency detection and shows that when combined with silence, these features reduce the impact of learner errors on model precision as well as lead to an overall improvement of model performance. This work adds to the growing body of research incorporating laughter as a feature for dialogue processing tasks and provides further support for the application of multimodality in dialogue-based CALL systems.