Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments

Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma


Abstract
As the deployment of AI models shifts towards edge devices, developing efficient sequence models has become critical. State-space models (SSMs), particularly Mamba, have emerged as strong rivals to Transformers due to their linear-time complexity and impressive performance across a range of tasks. However, their large parameter counts still hinder their use in resource-constrained environments. To address this, we propose a novel unstructured pruning framework specifically tailored for Mamba, achieving up to 70% parameter reduction with only a 3–9% drop in performance. Unlike pruning techniques designed for Transformers, our approach leverages Mamba’s unique recurrent dynamics by incorporating a pruning criterion based on both weight and gradient importance to preserve critical parameters, a gradual pruning schedule to maintain model stability, and a global strategy to optimize parameter allocation across the model. Extensive experiments on the WikiText-103, Long Range Arena, and ETT benchmarks demonstrate significant efficiency gains, including 1.77× faster inference and a 46% reduction in memory usage. Our component analysis confirms Mamba’s robustness to pruning, highlighting the framework’s potential for enabling practical deployment while underscoring the need for careful evaluation to avoid introducing biases in sensitive applications.
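
To make the method summary concrete, the sketch below shows one plausible realization in PyTorch of the three ingredients the abstract names: an importance score that combines weight magnitude with gradient magnitude, a gradual (cubic) schedule that ramps sparsity toward the 70% target, and a single global threshold shared across all prunable tensors. The exact saliency definition, schedule shape, and function names here are illustrative assumptions, not the authors' released implementation.

import torch

# Illustrative sketch only: the saliency score (|w| * |dL/dw|), the cubic ramp,
# and the global threshold are assumptions, not the paper's exact formulation.

def importance_scores(model):
    """Per-weight saliency |w| * |dL/dw|; assumes loss.backward() was just called."""
    scores = {}
    for name, p in model.named_parameters():
        if p.requires_grad and p.grad is not None and p.dim() > 1:
            scores[name] = p.detach().abs() * p.grad.detach().abs()
    return scores

def target_sparsity(step, total_steps, final_sparsity=0.70):
    """Gradual (cubic) ramp from 0 toward the final sparsity level."""
    frac = min(step / max(total_steps, 1), 1.0)
    return final_sparsity * (1.0 - (1.0 - frac) ** 3)

def global_prune(model, sparsity):
    """Zero out the globally least-important weights across all layers at once."""
    scores = importance_scores(model)
    if not scores:
        return
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = int(sparsity * flat.numel())
    if k == 0:
        return
    threshold = torch.kthvalue(flat, k).values
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in scores:
                p.mul_((scores[name] > threshold).to(p.dtype))

In practice, such a mask would be recomputed periodically during fine-tuning so that the gradual schedule lets the model recover accuracy between pruning steps.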
Anthology ID:
2025.emnlp-main.562
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11109–11137
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.emnlp-main.562/
DOI:
10.18653/v1/2025.emnlp-main.562
Cite (ACL):
Ibne Farabi Shihab, Sanjeda Akter, and Anuj Sharma. 2025. Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 11109–11137, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments (Shihab et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.emnlp-main.562.pdf
Checklist:
2025.emnlp-main.562.checklist.pdf