On the Versatility of Sparse Autoencoders for In-Context Learning

Ikhyun Cho; Gaeul Kwon; Julia Hockenmaier

doi:10.18653/v1/2025.findings-emnlp.1063

Abstract

Sparse autoencoders (SAEs) are emerging as a key analytical tool in the field of mechanistic interpretability for large language models (LLMs). While SAEs have primarily been used for interpretability, we shift focus and explore an understudied question: “Can SAEs be applied to practical tasks beyond interpretability?” Given that SAEs are trained on billions of tokens for sparse reconstruction, we believe they can serve as effective extractors, offering a wide range of useful knowledge that can benefit practical applications. Building on this motivation, we demonstrate that SAEs can be effectively applied to in-context learning (ICL). In particular, we highlight the utility of the SAE-reconstruction loss by showing that it provides a valuable signal in ICL—exhibiting a strong correlation with LLM performance and offering a powerful unsupervised approach for prompt selection. These findings underscore the versatility of SAEs and reveal their potential for real-world applications beyond interpretability. Our code is available at https://github.com/ihcho2/SAE-GPS.
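To make the central quantity concrete, the following is a minimal sketch of how an SAE reconstruction loss could be computed and used to rank candidate prompts. It is not the authors' implementation (see their repository for that). Everything here is an assumption made for illustration: the ReLU encoder/decoder form of the SAE, the randomly initialized weights standing in for a trained SAE, the random arrays standing in for LLM activations, and the hypothetical helper names `sae_reconstruction_loss` and `rank_prompts_by_loss`. The abstract does not state the direction of the correlation, so the ascending sort below is likewise only an assumed selection rule.

```python
import numpy as np

def sae_reconstruction_loss(x, W_enc, b_enc, W_dec, b_dec):
    """Per-row mean squared reconstruction error of a ReLU sparse autoencoder.

    Assumed SAE form (illustrative, not taken from the paper):
        f     = ReLU((x - b_dec) @ W_enc + b_enc)
        x_hat = f @ W_dec + b_dec
    """
    f = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)  # feature activations
    x_hat = f @ W_dec + b_dec                         # reconstruction
    return np.mean((x - x_hat) ** 2, axis=-1)

def rank_prompts_by_loss(prompt_activations, sae_params):
    """Score each prompt by its mean token-level loss; return (order, scores).

    `order` lists prompt indices from lowest to highest loss. Whether low or
    high loss is preferable is an assumption left to the caller.
    """
    scores = [
        float(np.mean(sae_reconstruction_loss(acts, *sae_params)))
        for acts in prompt_activations
    ]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order, scores

# Toy stand-ins: random weights instead of a trained SAE, random arrays
# instead of LLM hidden states (shape: tokens x d_model).
rng = np.random.default_rng(0)
d_model, d_sae = 16, 64
sae_params = (
    rng.normal(scale=0.1, size=(d_model, d_sae)),  # W_enc
    np.zeros(d_sae),                               # b_enc
    rng.normal(scale=0.1, size=(d_sae, d_model)),  # W_dec
    np.zeros(d_model),                             # b_dec
)
prompt_activations = [rng.normal(size=(n_tokens, d_model)) for n_tokens in (5, 8, 3)]
order, scores = rank_prompts_by_loss(prompt_activations, sae_params)
```

The score needs no labels, which is what makes a selection rule of this kind unsupervised: only the activations produced by each candidate prompt and a pretrained SAE are required.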

Anthology ID: 2025.findings-emnlp.1063
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 19531–19538
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1063/
DOI: 10.18653/v1/2025.findings-emnlp.1063
Cite (ACL): Ikhyun Cho, Gaeul Kwon, and Julia Hockenmaier. 2025. On the Versatility of Sparse Autoencoders for In-Context Learning. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 19531–19538, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): On the Versatility of Sparse Autoencoders for In-Context Learning (Cho et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1063.pdf
Checklist: 2025.findings-emnlp.1063.checklist.pdf
