@inproceedings{stahlberg-kumar-2025-role,
    title = "The Role of Outgoing Connection Heterogeneity in Feedforward Layers of Large Language Models",
    author = "Stahlberg, Felix  and
      Kumar, Shankar",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1143/",
    pages = "22487--22495",
    ISBN = "979-8-89176-332-6",
    abstract = "We report on investigations into the characteristics of outgoing connections in feedforward layers of large language models. Our findings show that inner neurons with diverse outgoing connection strengths are more critical to model performance than those with uniform connections. We propose a new fine-tuning loss that takes advantage of this observation by decreasing the outgoing connection entropy in feedforward layers. Using this loss yields gains over standard fine-tuning across two different model families (PaLM-2 and Gemma-2) for downstream tasks in math, coding, and language understanding. To further elucidate the role of outgoing connection heterogeneity, we develop a data-free structured pruning method, which uses entropy to identify and remove neurons. This method is considerably more effective than removing neurons either randomly or based on their magnitude."
}