Abstract
In-context learning (ICL) has demonstrated excellent performance across various downstream NLP tasks, especially when synergized with powerful large language models (LLMs). Existing studies evaluate ICL methods primarily on downstream task performance. This evaluation protocol overlooks the significant cost of the demonstration configuration process, i.e., tuning the demonstrations used as the ICL prompt. In this work, we point out that such a protocol leads to unfair comparisons and potentially biased evaluation, because we surprisingly find a correlation between configuration cost and task performance. We therefore call for a two-dimensional evaluation paradigm that considers both aspects, facilitating fairer comparisons. Finally, based on our empirical finding that demonstrations optimized on one language model generalize across language models of different sizes, we introduce a simple yet efficient strategy that can be applied to any ICL method as a plugin, yielding a better trade-off between the two dimensions under the proposed evaluation paradigm.
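The transfer strategy sketched at the end of the abstract lends itself to a short illustration. The following is a minimal sketch, not the authors' implementation: it scores candidate demonstrations with a small proxy model (hypothetically `gpt2`) by the likelihood they assign to a held-out labeled example, keeps the best one, and reuses it as the prompt for a larger model (hypothetically `gpt2-large`). The model names, the NLL scoring criterion, and the single-demonstration setup are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch of cross-model demonstration transfer (not the paper's code).
# Assumptions: gpt2 as the small "configuration" model, gpt2-large as the
# deployment model, and average negative log-likelihood as the scoring criterion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def nll(model, tokenizer, text: str) -> float:
    """Average negative log-likelihood the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()

# Candidate demonstrations and one held-out dev example (toy sentiment task).
candidates = [
    "Review: A moving, beautifully acted film. Sentiment: positive",
    "Review: Dull plot and wooden dialogue. Sentiment: negative",
]
dev_example = "Review: Sharp writing and a great cast. Sentiment: positive"
test_input = "Review: Predictable from the first scene. Sentiment:"

# 1) Configure cheaply on the small model: pick the demonstration that makes
#    the dev example most likely in context.
small_tok = AutoTokenizer.from_pretrained("gpt2")
small_lm = AutoModelForCausalLM.from_pretrained("gpt2")
best_demo = min(candidates, key=lambda d: nll(small_lm, small_tok, d + "\n" + dev_example))

# 2) Deploy on the larger model, reusing the demonstration chosen above, so the
#    expensive configuration step never touches the large model.
large_tok = AutoTokenizer.from_pretrained("gpt2-large")
large_lm = AutoModelForCausalLM.from_pretrained("gpt2-large")
ids = large_tok(best_demo + "\n" + test_input, return_tensors="pt").input_ids
out = large_lm.generate(ids, max_new_tokens=2, do_sample=False)
print(large_tok.decode(out[0][ids.shape[1]:]))
```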
- Anthology ID:
- 2024.emnlp-main.779
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 14068–14082
- URL:
- https://aclanthology.org/2024.emnlp-main.779/
- DOI:
- 10.18653/v1/2024.emnlp-main.779
- Cite (ACL):
- Guoxin Yu, Lemao Liu, Mo Yu, Yue Yu, and Xiang Ao. 2024. Rethinking the Evaluation of In-Context Learning for LLMs. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 14068–14082, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Rethinking the Evaluation of In-Context Learning for LLMs (Yu et al., EMNLP 2024)
- PDF:
- https://aclanthology.org/2024.emnlp-main.779.pdf