FastGAS: Fast Graph-based Annotation Selection for In-Context Learning

Zihan Chen, Song Wang, Cong Shen, Jundong Li


Abstract
In-context learning (ICL) enables large language models (LLMs) to tackle new tasks by using a series of training instances as prompts. Because generating the prompts requires sampling from a vast pool of instances and annotating them (e.g., adding labels in a classification task), existing methods select a subset of unlabeled examples for annotation, which improves prompt quality while reducing annotation costs. However, these methods are complex and often take a long time to select instances, which hinders their practical viability. To address this limitation, we propose FastGAS, a graph-based selection method designed to identify high-quality instances efficiently while minimizing computational overhead. We first construct a data similarity graph based on instance similarities. We then apply a graph partitioning algorithm to split the graph into pieces. Within each piece (i.e., subgraph), we adopt a greedy approach to pick the most representative nodes. By aggregating nodes from the different pieces and annotating the corresponding instances, we obtain a set of diverse and representative instances for ICL. Compared to prior approaches, our method not only achieves superior performance on different tasks but also significantly reduces selection time. In addition, we demonstrate the efficacy of our approach on larger LLMs.
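The three stages named in the abstract (similarity graph, partitioning, greedy selection per piece) can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it (`knn_graph`, `bisect_partition`, `greedy_select`, `select_for_annotation`, and the parameters `k`, `n_parts`, `per_part`) is hypothetical. It assumes instances are already embedded as vectors, uses a cosine k-nearest-neighbor graph as the similarity graph, substitutes a simple recursive median split along the principal direction for a real graph partitioning algorithm, and interprets "most representative" as covering the most not-yet-covered neighbors within a piece.

```python
import numpy as np


def knn_graph(emb, k):
    """Symmetric k-nearest-neighbor adjacency sets under cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor
    nearest = np.argsort(-sim, axis=1)[:, :k]
    adj = [set() for _ in range(len(emb))]
    for i, row in enumerate(nearest):
        for j in row:
            adj[i].add(int(j))
            adj[int(j)].add(i)
    return adj


def bisect_partition(emb, n_parts):
    """Stand-in partitioner (assumption): repeatedly split the largest piece
    at the median of its projection onto its principal direction.
    Requires n_parts <= len(emb)."""
    parts = [np.arange(len(emb))]
    while len(parts) < n_parts:
        parts.sort(key=len)
        idx = parts.pop()  # largest remaining piece
        centered = emb[idx] - emb[idx].mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        order = np.argsort(centered @ vt[0])
        half = len(idx) // 2
        parts.append(idx[order[:half]])
        parts.append(idx[order[half:]])
    return parts


def greedy_select(adj, piece, budget):
    """Greedily pick nodes that cover the most uncovered nodes in the piece.
    A node covers itself and its graph neighbors inside the same piece."""
    members = {int(i) for i in piece}
    covered, chosen = set(), []
    for _ in range(min(budget, len(members))):
        best = max(
            (n for n in sorted(members) if n not in chosen),
            key=lambda n: len((({n} | adj[n]) & members) - covered),
        )
        chosen.append(best)
        covered |= ({best} | adj[best]) & members
    return chosen


def select_for_annotation(emb, k=5, n_parts=4, per_part=2):
    """Build the graph, partition it, and aggregate the per-piece picks."""
    adj = knn_graph(emb, k)
    selected = []
    for piece in bisect_partition(emb, n_parts):
        selected.extend(greedy_select(adj, piece, per_part))
    return sorted(selected)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(40, 8))  # placeholder for real embeddings
    print(select_for_annotation(embeddings))
```

The returned indices are the instances one would send for annotation. Drawing a fixed number from each piece is what spreads the selection across the pool, while the coverage criterion favors well-connected nodes inside each piece.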
Anthology ID:
2024.findings-acl.581
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9764–9780
URL:
https://aclanthology.org/2024.findings-acl.581
Cite (ACL):
Zihan Chen, Song Wang, Cong Shen, and Jundong Li. 2024. FastGAS: Fast Graph-based Annotation Selection for In-Context Learning. In Findings of the Association for Computational Linguistics ACL 2024, pages 9764–9780, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
FastGAS: Fast Graph-based Annotation Selection for In-Context Learning (Chen et al., Findings 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.581.pdf