Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models

Mario Sanz-Guerrero, Katharina von der Wense


Abstract
In-context learning (ICL) has transformed the use of large language models (LLMs) for NLP tasks, enabling few-shot learning by conditioning on labeled examples without finetuning. Despite its effectiveness, ICL is prone to errors, especially for challenging examples. With the goal of improving the performance of ICL, we propose *corrective in-context learning* (CICL), an approach that incorporates a model’s incorrect predictions alongside ground truth corrections into the prompt, aiming to enhance classification accuracy through self-correction. However, contrary to our hypothesis, extensive experiments on text classification tasks demonstrate that CICL consistently underperforms standard ICL, with performance degrading as the proportion of corrections in the prompt increases. Our findings indicate that CICL introduces confusion by disrupting the model’s task understanding, rather than refining its predictions. Additionally, we observe that presenting harder examples in standard ICL does not improve performance, suggesting that example difficulty alone may not be a reliable criterion for effective selection. By presenting these negative results, we provide important insights into the limitations of self-corrective mechanisms in LLMs and offer directions for future research.
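The abstract's core idea, augmenting ICL demonstrations with the model's own wrong predictions plus ground-truth corrections, can be sketched as prompt construction. This is a minimal illustration, not the paper's exact template: the field names (`Text:`, `Initial answer:`, `Correct label:`) and demonstration format are assumptions for exposition.

```python
def build_icl_prompt(demos, query):
    """Standard ICL: each demonstration pairs an input with its gold label."""
    lines = [f"Text: {x}\nLabel: {y}" for x, y in demos]
    lines.append(f"Text: {query}\nLabel:")
    return "\n\n".join(lines)


def build_cicl_prompt(demos, query):
    """CICL (as described in the abstract): each demonstration additionally
    shows the model's earlier incorrect prediction alongside the
    ground-truth correction, inviting the model to self-correct."""
    lines = []
    for x, y_pred, y_gold in demos:
        lines.append(
            f"Text: {x}\nInitial answer: {y_pred}\nCorrect label: {y_gold}"
        )
    # The query ends at the point where the model would produce its answer.
    lines.append(f"Text: {query}\nInitial answer:")
    return "\n\n".join(lines)
```

Under this framing, the paper's negative result is that feeding prompts like `build_cicl_prompt` produces to an LLM classifier consistently underperforms the plain `build_icl_prompt` variant, increasingly so as more corrected examples are mixed in.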
Anthology ID:
2025.insights-1.4
Volume:
The Sixth Workshop on Insights from Negative Results in NLP
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Aleksandr Drozd, João Sedoc, Shabnam Tafreshi, Arjun Akula, Raphael Shu
Venues:
insights | WS
Publisher:
Association for Computational Linguistics
Pages:
24–33
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.insights-1.4/
Cite (ACL):
Mario Sanz-Guerrero and Katharina von der Wense. 2025. Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models. In The Sixth Workshop on Insights from Negative Results in NLP, pages 24–33, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Corrective In-Context Learning: Evaluating Self-Correction in Large Language Models (Sanz-Guerrero & von der Wense, insights 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.insights-1.4.pdf