@inproceedings{naeem-etal-2025-neuralnexus,
    title = "{N}eural{N}exus at {BEA} 2025 Shared Task: Retrieval-Augmented Prompting for Mistake Identification in {AI} Tutors",
    author = "Naeem, Numaan  and
      Ahmad, Sarfraz  and
      Ahsan, Momina  and
      Iqbal, Hasan",
    editor = {Kochmar, Ekaterina  and
      Alhafni, Bashar  and
      Bexte, Marie  and
      Burstein, Jill  and
      Horbach, Andrea  and
      Laarmann-Quante, Ronja  and
      Tack, Ana{\"i}s  and
      Yaneva, Victoria  and
      Yuan, Zheng},
    booktitle = "Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.bea-1.100/",
    doi = "10.18653/v1/2025.bea-1.100",
    pages = "1254--1259",
    ISBN = "979-8-89176-270-1",
    abstract = "This paper presents our system for Track 1: Mistake Identification in the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The task involves evaluating whether a tutor{'}s response correctly identifies a mistake in a student{'}s mathematical reasoning. We explore four approaches: (1) an ensemble of machine learning models over pooled token embeddings from multiple pretrained language models (LMs); (2) a frozen sentence-transformer using [CLS] embeddings with an MLP classifier; (3) a history-aware model with multi-head attention between token-level history and response embeddings; and (4) a retrieval-augmented few-shot prompting system with a large language model (LLM), i.e., GPT-4o. Our final system retrieves semantically similar examples, constructs structured prompts, and uses schema-guided output parsing to produce interpretable predictions. It outperforms all baselines, demonstrating the effectiveness of combining example-driven prompting with LLM reasoning for pedagogical feedback assessment."
}