2025
Toward Efficient Sparse Autoencoder-Guided Steering for Improved In-Context Learning in Large Language Models
Ikhyun Cho | Julia Hockenmaier
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Sparse autoencoders (SAEs) have emerged as a powerful analytical tool in mechanistic interpretability for large language models (LLMs), with growing success in applications beyond interpretability. Building on this momentum, we present a novel approach that leverages SAEs to enhance the general in-context learning (ICL) performance of LLMs. Specifically, we introduce Feature Detection through Prompt Variation (FDPV), which leverages the SAE’s remarkable ability to capture subtle differences between prompts, enabling efficient feature selection for downstream steering. In addition, we propose a novel steering method tailored to ICL, Selective In-Context Steering (SISTER), grounded in recent insights from ICL research that LLMs utilize label words as key anchors. Our method yields a 3.5% average performance improvement across diverse text classification tasks and exhibits greater robustness to hyperparameter variations compared to standard steering approaches. Our code is available at https://github.com/ihcho2/SAE-ICL.
The Power of Bullet Lists: A Simple Yet Effective Prompting Approach to Enhancing Spatial Reasoning in Large Language Models
Ikhyun Cho | Changyeon Park | Julia Hockenmaier
Findings of the Association for Computational Linguistics: NAACL 2025
While large language models (LLMs) are dominating the field of natural language processing, it remains an open question how well these models can perform spatial reasoning. Contrary to recent studies suggesting that LLMs struggle with spatial reasoning tasks, we demonstrate in this paper that a novel prompting technique, termed Patient Visualization of Thought (Patient-VoT), can boost LLMs’ spatial reasoning abilities. The core idea behind Patient-VoT is to explicitly integrate *bullet lists, coordinates, and visualizations* into the reasoning process. By applying Patient-VoT, we achieve a significant boost in spatial reasoning performance compared to prior prompting techniques. We also show that integrating bullet lists into reasoning is effective in planning tasks, highlighting its general effectiveness across different applications.
On the Versatility of Sparse Autoencoders for In-Context Learning
Ikhyun Cho | Gaeul Kwon | Julia Hockenmaier
Findings of the Association for Computational Linguistics: EMNLP 2025
Sparse autoencoders (SAEs) are emerging as a key analytical tool in the field of mechanistic interpretability for large language models (LLMs). While SAEs have primarily been used for interpretability, we shift focus and explore an understudied question: “Can SAEs be applied to practical tasks beyond interpretability?” Given that SAEs are trained on billions of tokens for sparse reconstruction, we believe they can serve as effective extractors, offering a wide range of useful knowledge that can benefit practical applications. Building on this motivation, we demonstrate that SAEs can be effectively applied to in-context learning (ICL). In particular, we highlight the utility of the SAE-reconstruction loss by showing that it provides a valuable signal in ICL—exhibiting a strong correlation with LLM performance and offering a powerful unsupervised approach for prompt selection. These findings underscore the versatility of SAEs and reveal their potential for real-world applications beyond interpretability. Our code is available at https://github.com/ihcho2/SAE-GPS.
2024
Tutor-ICL: Guiding Large Language Models for Improved In-Context Learning Performance
Ikhyun Cho | Gaeul Kwon | Julia Hockenmaier
Findings of the Association for Computational Linguistics: EMNLP 2024
There has been a growing body of work focusing on the in-context learning (ICL) abilities of large language models (LLMs). However, it is an open question how effective ICL can be. This paper presents Tutor-ICL, a simple prompting method for classification tasks inspired by how effective instructors might engage their students in learning a task. Specifically, we propose presenting exemplar answers in a *comparative format* rather than the traditional single-answer format. We also show that including the test instance before the exemplars can improve performance, making it easier for LLMs to focus on relevant exemplars. Lastly, we include a summarization step before attempting the test, following a common human practice. Experiments on various classification tasks, conducted across both decoder-only LLMs (Llama 2, 3) and encoder-decoder LLMs (Flan-T5-XL, XXL), show that Tutor-ICL consistently boosts performance, achieving up to a 13.76% increase in accuracy.
2023
SIR-ABSC: Incorporating Syntax into RoBERTa-based Sentiment Analysis Models with a Special Aggregator Token
Ikhyun Cho | Yoonhwa Jung | Julia Hockenmaier
Findings of the Association for Computational Linguistics: EMNLP 2023
We present a simple but effective method to incorporate syntactic dependency information directly into transformer-based language models (e.g., RoBERTa) for tasks such as Aspect-Based Sentiment Classification (ABSC), where the desired output depends on specific input tokens. In contrast to prior approaches to ABSC that capture syntax by combining language models with graph neural networks over dependency trees, our model, Syntax-Integrated RoBERTa for ABSC (SIR-ABSC), incorporates syntax directly into the language model by using a novel aggregator token. Despite its simplicity, SIR-ABSC outperforms these more complex models, yielding new state-of-the-art results on ABSC.