Luisa Ribeiro-Flucht

2026

From Dialogue to Learner Modeling: Identifying Candidate Signals of Productive Use in LLM-Based Grammar Practice
Luisa Ribeiro-Flucht | Lanhua Huang | Xiaobin Chen
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)

Adaptive language-learning systems often model progress through correctness in constrained exercises, where the target response is predefined. In dialogue-based tutors, by contrast, learners can respond appropriately in many ways, making evidence of progress harder to interpret. This raises a learner-modeling problem: determining whether learner production provides useful evidence of progress, which aspects are informative, and how they might support adaptation. We address this problem using pilot data from an LLM-based English grammar tutor, comprising 40 pre- and post-test tasks, treatment interactions, and 2,406 learner messages. We propose a coding scheme for learner production in dialogue and explore whether the resulting evidence types can support future adaptive decisions. Findings show that learner production in dialogue can support adaptive grammar practice: prior target use predicted short-term performance, while finer-grained evidence helped distinguish different levels of productive control. We discuss implications for adaptive grammar-based dialogue systems that use learner production to support communicative practice.

2025

A Framework for Proficiency-Aligned Grammar Practice in LLM-Based Dialogue Systems
Luisa Ribeiro-Flucht | Xiaobin Chen | Detmar Meurers
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

Communicative practice is critical for second language development, yet learners often lack targeted, engaging opportunities to use new grammar structures. While large language models (LLMs) can offer coherent interactions, they are not inherently aligned with pedagogical goals or proficiency levels. In this paper, we explore how LLMs can be integrated into a structured framework for contextually-constrained, grammar-focused interaction, building on an existing goal-oriented dialogue system. Through controlled simulations, we evaluate five LLMs across 75 A2-level tasks under two conditions: (i) grammar-targeted, task-anchored prompting and (ii) the addition of a lightweight post-generation validation pipeline using a grammar annotator. Our findings show that template-based prompting alone substantially increases target-form coverage up to 91.4% for LLaMA 3.1-70B-Instruct, while reducing overly advanced grammar usage. The validation pipeline provides an additional boost in form-focused tasks, raising coverage to 96.3% without significantly degrading appropriateness.

2024

Developing a Web-Based Intelligent Language Assessment Platform Powered by Natural Language Processing Technologies
Sarah Löber | Björn Rudzewitz | Daniela Verratti Souto | Luisa Ribeiro-Flucht | Xiaobin Chen
Proceedings of the 13th Workshop on Natural Language Processing for Computer Assisted Language Learning

Explainable AI in Language Learning: Linking Empirical Evidence and Theoretical Concepts in Proficiency and Readability Modeling of Portuguese
Luisa Ribeiro-Flucht | Xiaobin Chen | Detmar Meurers
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

While machine learning methods have supported significantly improved results in education research, a common deficiency lies in the explainability of the result. Explainable AI (XAI) aims to fill that gap by providing transparent, conceptually understandable explanations for the classification decisions, enhancing human comprehension and trust in the outcomes. This paper explores an XAI approach to proficiency and readability assessment employing a comprehensive set of 465 linguistic complexity measures. We identify theoretical descriptions associating such measures with varying levels of proficiency and readability and validate them using cross-corpus experiments employing supervised machine learning and Shapley Additive Explanations. The results not only highlight the utility of a diverse set of complexity measures in effectively modeling proficiency and readability in Portuguese, achieving a state-of-the-art accuracy of 0.70 in the proficiency classification task and of 0.84 in the readability classification task, but they largely corroborate the theoretical research assumptions, especially in the lexical domain.
