In-Context Learning on a Budget: A Case Study in Token Classification

Uri Berger, Tal Baumel, Gabriel Stanovsky


Abstract
Few-shot in-context learning (ICL) typically assumes access to large annotated training sets. However, in many real-world scenarios, such as domain adaptation, there is only a limited budget to annotate a small number of samples, with the goal of maximizing downstream performance. We study various methods for selecting samples to annotate within a predefined budget, focusing on token classification tasks, which are expensive to annotate and relatively less studied in ICL setups. Across various tasks, models, and datasets, we observe that no method significantly outperforms the others, with most yielding similar results, including random sample selection for annotation. Moreover, we demonstrate that a relatively small annotated sample pool can achieve performance comparable to using the entire training set. We hope that future work adopts our realistic paradigm, which takes annotation budget into account.
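The paradigm the abstract describes can be sketched in a few lines: pick a small pool of samples to annotate within a fixed budget, then format the annotated pool as few-shot demonstrations for a token classification query. The sketch below is illustrative only; the function names, the token/label prompt format, and the use of uniform random selection (which the paper reports performs on par with more elaborate strategies) are assumptions, not the authors' implementation.

```python
import random

def select_annotation_pool(unlabeled_pool, budget, seed=0):
    """Uniformly sample `budget` items to send for annotation.
    Hypothetical helper: random selection is one of the baselines
    the paper finds competitive with other selection methods."""
    rng = random.Random(seed)
    return rng.sample(unlabeled_pool, k=min(budget, len(unlabeled_pool)))

def build_icl_prompt(annotated_examples, query_tokens):
    """Format annotated (tokens, labels) pairs plus an unlabeled query
    into a simple few-shot prompt. The token/LABEL format is an
    illustrative choice, not the paper's prompt template."""
    blocks = []
    for tokens, labels in annotated_examples:
        tagged = " ".join(f"{t}/{l}" for t, l in zip(tokens, labels))
        blocks.append(f"Input: {' '.join(tokens)}\nLabels: {tagged}")
    blocks.append(f"Input: {' '.join(query_tokens)}\nLabels:")
    return "\n\n".join(blocks)

# Example: budget of 2 demonstrations from a pool of 4 sentences.
pool = [["John", "lives", "in", "Paris"],
        ["Acme", "hired", "Mary"],
        ["She", "flew", "to", "Tokyo"],
        ["Bob", "joined", "IBM"]]
chosen = select_annotation_pool(pool, budget=2)
# In practice the chosen samples would now be annotated by humans;
# here we attach placeholder labels for illustration.
annotated = [(toks, ["O"] * len(toks)) for toks in chosen]
prompt = build_icl_prompt(annotated, ["Alice", "visited", "Berlin"])
```

The point of the sketch is the interface, not the strategy: any selection method (diversity-, similarity-, or uncertainty-based) slots into `select_annotation_pool` without changing the prompting code.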
Anthology ID:
2025.insights-1.2
Volume:
The Sixth Workshop on Insights from Negative Results in NLP
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Aleksandr Drozd, João Sedoc, Shabnam Tafreshi, Arjun Akula, Raphael Shu
Venues:
insights | WS
Publisher:
Association for Computational Linguistics
Pages:
7–14
URL:
https://preview.aclanthology.org/landing_page/2025.insights-1.2/
Cite (ACL):
Uri Berger, Tal Baumel, and Gabriel Stanovsky. 2025. In-Context Learning on a Budget: A Case Study in Token Classification. In The Sixth Workshop on Insights from Negative Results in NLP, pages 7–14, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
In-Context Learning on a Budget: A Case Study in Token Classification (Berger et al., insights 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.insights-1.2.pdf