Abstract
This paper studies in-context learning by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that always predict the same label. We find that component accuracies are well-correlated across different demonstration sets and perturbations of prompt templates. Based on our findings, we propose component reweighting, which learns to linearly re-scale the component activations from a few labeled examples. Given 24 labeled examples, our method improves by an average of 6.0% accuracy points over 24-shot ICL across 8 tasks on Llama-2-7B. Overall, this paper both enriches our understanding of ICL and provides a practical method for improvement by examining model internals.
- Anthology ID: 2024.emnlp-main.574
- Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 10280–10299
- URL: https://preview.aclanthology.org/add_missing_videos/2024.emnlp-main.574/
- DOI: 10.18653/v1/2024.emnlp-main.574
- Cite (ACL): Ting-Yun Chang, Jesse Thomason, and Robin Jia. 2024. When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10280–10299, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models (Chang et al., EMNLP 2024)
- PDF: https://preview.aclanthology.org/add_missing_videos/2024.emnlp-main.574.pdf
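The component-reweighting idea described in the abstract — treating the model's output logits as a sum of per-component (attention head and MLP) contributions, then learning one scalar weight per component from a handful of labeled examples — can be sketched roughly as below. This is a minimal illustration on synthetic data: the shapes, the number of components, and the plain gradient-descent loop are all assumptions for the sketch, not the authors' implementation.

```python
import numpy as np

# Illustrative setup (NOT the paper's code): pretend each example's label
# logits decompose additively into per-component contributions.
rng = np.random.default_rng(0)
n_components, n_examples, n_labels = 10, 24, 2

# contribs[c, i] = component c's additive contribution to example i's logits
contribs = rng.normal(size=(n_components, n_examples, n_labels))
labels = rng.integers(0, n_labels, size=n_examples)

# Standard ICL prediction: every component weighted equally (w = 1).
full_logits = contribs.sum(axis=0)

# Component reweighting: learn one scalar per component on the few labeled
# examples by minimizing cross-entropy with simple gradient descent.
w = np.ones(n_components)
lr = 0.1
for _ in range(200):
    logits = np.einsum("c,cil->il", w, contribs)
    logits = logits - logits.max(axis=1, keepdims=True)  # stable softmax
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad_logits = probs.copy()
    grad_logits[np.arange(n_examples), labels] -= 1.0
    grad_w = np.einsum("cil,il->c", contribs, grad_logits) / n_examples
    w -= lr * grad_w

reweighted_logits = np.einsum("c,cil->il", w, contribs)
```

Since the loss is convex in the per-component weights, this amounts to fitting a small logistic-regression-style layer over frozen component contributions, which is why so few labeled examples suffice.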