Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, Jianfeng Gao
Abstract
In this work, we address the memory overhead of deploying Mixture-of-Experts (MoE) architectures in Large Language Models (LLMs). While MoE layers improve LLM performance without increasing inference costs, the ever-growing number of experts inflates memory requirements, hindering practical deployment. Our empirical study reveals that some experts encode redundant knowledge during pre-training. We thus propose a method of grouping and pruning similar experts to improve the model’s parameter efficiency. We validate the effectiveness of our method by pruning three state-of-the-art MoE architectures, including Mixtral, Deepseek-MoE, and Qwen. The evaluation shows that our method outperforms other model pruning methods on a range of natural language tasks.
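The abstract describes the method only at a high level. The sketch below is a minimal, hypothetical illustration of the general idea of grouping similar experts and keeping one representative per group; the cosine-similarity criterion, the greedy grouping, the threshold, and the name `group_and_prune_experts` are all assumptions made for illustration, not the authors' actual algorithm (see the paper PDF below for details).

```python
# Illustrative sketch only: assumes each expert is summarized by its flattened
# weight matrix and that pairwise cosine similarity is a reasonable proxy for
# redundant knowledge. The paper's actual criterion may differ.
import torch
import torch.nn.functional as F

def group_and_prune_experts(expert_weights: list[torch.Tensor],
                            sim_threshold: float = 0.95) -> list[int]:
    """Greedily group experts whose flattened weights are highly similar and
    keep one representative per group. Returns indices of the kept experts."""
    flat = torch.stack([w.flatten() for w in expert_weights])  # (E, D)
    flat = F.normalize(flat, dim=1)                            # unit-norm rows
    sim = flat @ flat.T                                        # (E, E) cosine similarities

    kept, pruned = [], set()
    for i in range(len(expert_weights)):
        if i in pruned:
            continue
        kept.append(i)  # expert i represents its group
        # Mark every remaining expert that is too similar to i as redundant.
        for j in range(i + 1, len(expert_weights)):
            if sim[i, j] >= sim_threshold:
                pruned.add(j)
    return kept

# Toy usage: 8 random experts with two near-duplicate pairs, which should
# collapse to 6 kept experts, e.g. [0, 2, 3, 4, 6, 7].
torch.manual_seed(0)
experts = [torch.randn(16, 16) for _ in range(8)]
experts[1] = experts[0] + 0.01 * torch.randn(16, 16)  # near-duplicate of expert 0
experts[5] = experts[4] + 0.01 * torch.randn(16, 16)  # near-duplicate of expert 4
print(group_and_prune_experts(experts))
```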
- Anthology ID: 2025.findings-acl.4
- Volume: Findings of the Association for Computational Linguistics: ACL 2025
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venues: Findings | WS
- Publisher: Association for Computational Linguistics
- Pages: 86–102
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.4/
- Cite (ACL): Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, and Jianfeng Gao. 2025. Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts. In Findings of the Association for Computational Linguistics: ACL 2025, pages 86–102, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts (Zhang et al., Findings 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.4.pdf