Mohammad Rashedul Hasan
2025
A Three-Tier LLM Framework for Forecasting Student Engagement from Qualitative Longitudinal Data
Ahatsham Hayat | Helen Martinez | Bilal Khan | Mohammad Rashedul Hasan
Proceedings of the 29th Conference on Computational Natural Language Learning
Forecasting nuanced shifts in student engagement from longitudinal experiential (LE) data—multi-modal, qualitative trajectories of academic experiences over time—remains challenging due to high dimensionality and missingness. We propose a natural language processing (NLP)-driven framework using large language models (LLMs) to forecast binary engagement levels across four dimensions: Lecture Engagement Disposition, Academic Self-Efficacy, Performance Self-Evaluation, and Academic Identity and Value Perception. Evaluated on 960 trajectories from 96 first-year STEM students, our three-tier approach—LLM-informed imputation to generate textual descriptors for missing-not-at-random (MNAR) patterns, zero-shot feature selection via ensemble voting, and fine-tuned LLMs—processes textual non-cognitive responses. LLMs substantially outperform numeric baselines (e.g., Random Forest, LSTM) by capturing contextual nuances in student responses. Encoder-only LLMs surpass decoder-only variants, highlighting architectural strengths for sparse, qualitative LE data. Our framework advances NLP solutions for modeling student engagement from complex LE data, excelling where traditional methods struggle.
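The abstract outlines a three-tier pipeline (LLM-informed imputation for MNAR gaps, zero-shot feature selection via ensemble voting, and a fine-tuned classifier over textual responses). Below is a minimal Python sketch of how such a pipeline could be wired together; the `llm` callables, prompt wording, and vote threshold are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
from typing import Callable

LLM = Callable[[str], str]  # any text-in, text-out model call (placeholder assumption)

# Tier 1: LLM-informed imputation -- fill MNAR gaps in a trajectory with
# generated textual descriptors rather than numeric placeholders.
def impute_trajectory(llm: LLM, weekly_responses: list) -> list:
    imputed = []
    for week, response in enumerate(weekly_responses, start=1):
        if response is None:
            response = llm(
                f"The student gave no response in week {week}. "
                f"Earlier responses: {imputed}. "
                "Write a brief, plausible descriptor of their likely state."
            )
        imputed.append(response)
    return imputed

# Tier 2: zero-shot feature selection via ensemble voting -- ask several
# LLM "voters" which features matter and keep the majority picks.
def select_features(voters: list, features: list, keep_top: int = 5) -> list:
    votes = Counter()
    for llm in voters:
        answer = llm(
            "Which of these features best predict student engagement? "
            + ", ".join(features)
        )
        votes.update(f for f in features if f in answer)
    return [f for f, _ in votes.most_common(keep_top)]

# Tier 3: a fine-tuned (e.g., encoder-only) LLM classifies the imputed,
# feature-filtered textual trajectory into a binary engagement label.
def forecast_engagement(classifier: Callable[[str], int], trajectory: list) -> int:
    return classifier(" ".join(trajectory))
```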
ConText-LE: Cross-Distribution Generalization for Longitudinal Experiential Data via Narrative-Based LLM Representations
Ahatsham Hayat | Bilal Khan | Mohammad Rashedul Hasan
Findings of the Association for Computational Linguistics: EMNLP 2025
Longitudinal experiential data offers rich insights into dynamic human states, yet building models that generalize across diverse contexts remains challenging. We propose ConText-LE, a framework that systematically investigates text representation strategies and output formulations to maximize large language model cross-distribution generalization for behavioral forecasting. Our novel Meta-Narrative representation synthesizes complex temporal patterns into semantically rich narratives, while Prospective Narrative Generation reframes prediction as a generative task aligned with LLMs’ contextual understanding capabilities. Through comprehensive experiments on three diverse longitudinal datasets addressing the underexplored challenge of cross-distribution generalization in mental health and educational forecasting, we show that combining Meta-Narrative input with Prospective Narrative Generation significantly outperforms existing approaches. Our method achieves up to 12.28% improvement in out-of-distribution accuracy and up to 11.99% improvement in F1 scores over binary classification methods. Bidirectional evaluation and architectural ablation studies confirm the robustness of our approach, establishing ConText-LE as an effective framework for reliable behavioral forecasting across temporal and contextual shifts.
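The sketch below illustrates, under similarly hedged assumptions, one way Meta-Narrative input and Prospective Narrative Generation could fit together: the longitudinal record is first compressed into a narrative, the model then generates a forward-looking narrative, and a label is read off afterwards. Prompts, the `llm` callable, and the keyword-based label mapping are placeholders, not the paper's formulation.

```python
from typing import Callable

LLM = Callable[[str], str]  # any text-in, text-out model call (placeholder assumption)

# Meta-Narrative input: compress a longitudinal record into one
# semantically rich narrative instead of a flat list of features.
def build_meta_narrative(llm: LLM, observations: list) -> str:
    timeline = "; ".join(f"week {o['week']}: {o['note']}" for o in observations)
    return llm(
        "Summarize this timeline as a short narrative describing trends "
        f"and changes over time: {timeline}"
    )

# Prospective Narrative Generation: the model writes a narrative about the
# next period, and a discrete label is derived from the generated text.
def forecast(llm: LLM, observations: list) -> tuple:
    narrative = build_meta_narrative(llm, observations)
    prospective = llm(
        narrative
        + "\n\nWrite a brief narrative describing this person's likely "
        "state in the next period."
    )
    label = int("engaged" in prospective.lower())  # crude mapping, for illustration only
    return prospective, label
```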