NeuronMerge: Merging Models via Functional Neuron Groups
Wangyun Gu | Qianghua Gao | Zhang Li-Xin | Xu Shen | Jieping Ye
Findings of the Association for Computational Linguistics: ACL 2025
Model merging techniques like task arithmetic, which combines model parameters through weighted averaging, have proven effective. However, the success of task arithmetic relies on linearity between model weight differences and output feature changes, a property often lacking in conventionally fine-tuned models. In this work, we employ neuron description methods to analyze and classify neurons based on their functionalities. We theoretically demonstrate that grouping Multi-Layer Perceptron (MLP) neurons by functionality enhances model linearity. Building on this, we propose a neuron-based task arithmetic merging method that consistently improves performance across various tasks and model scales. Our approach is complementary to existing merging techniques, achieving superior results when merging models fine-tuned on fundamental tasks such as Math, Code, and Translation.
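As background, standard task arithmetic forms a "task vector" (fine-tuned weights minus base weights) per model and adds a weighted sum of these vectors back to the base. The sketch below illustrates that baseline formulation only, on plain dictionaries of scalar weights; the function name and the per-task coefficients `coeffs` are illustrative assumptions, not the paper's neuron-grouped method.

```python
# Illustrative sketch of plain task arithmetic (not the paper's
# neuron-grouped variant): merged = base + sum_i c_i * (ft_i - base).
def task_arithmetic(base, finetuned_models, coeffs):
    """Merge fine-tuned models via a weighted sum of task vectors.

    base: dict mapping parameter name -> weight (scalars here for clarity)
    finetuned_models: list of dicts with the same keys as `base`
    coeffs: one merging coefficient per fine-tuned model
    """
    merged = {}
    for name, w in base.items():
        # Task vector for model i is (ft_i[name] - w); scale and accumulate.
        delta = sum(c * (ft[name] - w) for c, ft in zip(coeffs, finetuned_models))
        merged[name] = w + delta
    return merged


# Example: two task vectors of +1.0 and -1.0, each weighted 0.5, cancel out.
base = {"w": 1.0}
merged = task_arithmetic(base, [{"w": 2.0}, {"w": 0.0}], [0.5, 0.5])
```

In a real setting the dictionary values would be weight tensors and the sum would run element-wise; the paper's contribution is, roughly, applying this arithmetic at the granularity of functional neuron groups rather than whole weight matrices.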