NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors

Numaan Naeem; Sarfraz Ahmad; Momina Ahsan; Hasan Iqbal

NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors

Numaan Naeem, Sarfraz Ahmad, Momina Ahsan, Hasan Iqbal

Abstract

This paper presents our system for Track 1: Mistake Identification in the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The task involves evaluating whether a tutor’s response correctly identifies a mistake in a student’s mathematical reasoning. We explore four approaches: (1) an ensemble of machine learning models over pooled token embeddings from multiple pretrained langauge models (LMs); (2) a frozen sentence-transformer using [CLS] embeddings with an MLP classifier; (3) a history-aware model with multi-head attention between token-level history and response embeddings; and (4) a retrieval-augmented few-shot prompting system with a large language model (LLM) i.e. GPT 4o. Our final system retrieves semantically similar examples, constructs structured prompts, and uses schema-guided output parsing to produce interpretable predictions. It outperforms all baselines, demonstrating the effectiveness of combining example-driven prompting with LLM reasoning for pedagogical feedback assessment.

Anthology ID:: 2025.bea-1.100
Volume:: Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Month:: July
Year:: 2025
Address:: Vienna, Austria
Editors:: Ekaterina Kochmar, Bashar Alhafni, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
Venues:: BEA | WS
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 1254–1259
Language:
URL:: https://preview.aclanthology.org/landing_page/2025.bea-1.100/
DOI:
Bibkey:
Cite (ACL):: Numaan Naeem, Sarfraz Ahmad, Momina Ahsan, and Hasan Iqbal. 2025. NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 1254–1259, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):: NeuralNexus at BEA 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in AI Tutors (Naeem et al., BEA 2025)
Copy Citation:
PDF:: https://preview.aclanthology.org/landing_page/2025.bea-1.100.pdf

PDF Cite Search Fix data