SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang


Abstract
Despite significant advancements, the practical deployment of Large Language Models (LLMs) is often hampered by their immense sizes, highlighting the need for effective compression techniques. Singular Value Decomposition (SVD) emerges as a promising method for compressing LLMs. However, existing SVD-based compression approaches suffer from substantial truncation losses, leading to severe performance degradation in compressed models. In this work, we introduce SVD-LLM V2, a novel SVD-based LLM compression method that optimizes singular value truncation in SVD compression with two key strategies. First, SVD-LLM V2 employs dynamic compression ratio allocation to effectively balance the extremely large truncation loss across different layers. Second, it implements loss-optimized weight truncation to ensure that the truncated singular values result in a lower and more stable truncation loss in practice. We evaluate SVD-LLM V2 on ten datasets and five models at various scales and demonstrate that it outperforms current state-of-the-art methods. The source code is available at https://github.com/AIoT-MLSys-Lab/SVD-LLM.
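
For context, the sketch below illustrates the basic operation the paper optimizes: compressing a single weight matrix by truncating its smallest singular values and storing the two low-rank factors. This is a generic NumPy illustration of plain SVD truncation, not the paper's dynamic compression ratio allocation or loss-optimized weight truncation; the function name, dimensions, and rank choice are illustrative only.

import numpy as np

def svd_truncate(W: np.ndarray, rank: int):
    """Compress a weight matrix W (m x n) by truncating its SVD to `rank`.

    Returns factors A (m x rank) and B (rank x n) whose product
    approximates W; storing A and B is cheaper than W whenever
    rank < m*n / (m + n).
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep the `rank` largest singular values. The discarded tail is the
    # truncation loss: in Frobenius norm it equals the square root of the
    # sum of squares of the dropped singular values.
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Example: compress a 1024 x 1024 layer to rank 256 (~50% of the parameters).
W = np.random.randn(1024, 1024).astype(np.float32)
A, B = svd_truncate(W, rank=256)
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative truncation error: {err:.4f}")

Applying a single fixed rank to every layer, as above, is exactly what the paper argues against: truncation loss varies widely across layers, which motivates allocating per-layer compression ratios instead.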
Anthology ID:
2025.naacl-long.217
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4287–4296
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.217/
Cite (ACL):
Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, and Mi Zhang. 2025. SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4287–4296, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression (Wang et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.217.pdf