FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, Kai-Chiang Wu
Abstract
Although large language models (LLMs) have achieved remarkable performance, their enormous parameter counts hinder deployment on resource-constrained hardware. Low-rank compression can reduce both memory usage and computational demand, but applying a uniform compression ratio across all layers often leads to significant performance degradation, and previous methods perform poorly during decoding. To address these issues, we propose the Fine-grained Low-Rank Compressor (FLRC), which efficiently determines an optimal rank allocation for each layer and incorporates progressive low-rank decoding to maintain text generation quality. Comprehensive experiments on diverse benchmarks demonstrate the superiority of FLRC, achieving up to a 17% improvement in ROUGE-L on summarization tasks compared to state-of-the-art low-rank compression methods, establishing a more robust and efficient framework to improve LLM inference.
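For readers unfamiliar with low-rank compression, the sketch below illustrates the general idea the abstract builds on: factorizing a layer's weight matrix into two skinny matrices via truncated SVD, so one dense matmul becomes two cheaper ones. This is a minimal, generic illustration only, not the FLRC algorithm; the function name `low_rank_factorize`, the 4096-dimensional layer, and the per-layer rank of 512 are assumptions for demonstration, whereas FLRC itself determines the per-layer rank allocation and adds progressive low-rank decoding.

```python
# Generic low-rank weight compression sketch (NOT the FLRC method from the paper).
# A layer y = W x with W of shape (d_out, d_in) is approximated by y ≈ A (B x),
# where A has shape (d_out, r) and B has shape (r, d_in) for some rank r < min(d_out, d_in).
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Return factors (A, B) with W ≈ A @ B and inner dimension `rank`."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (d_out, rank); singular values folded into A
    B = Vt[:rank, :]             # (rank, d_in)
    return A, B

# Illustrative example: one 4096x4096 layer compressed to an assumed rank of 512.
rng = np.random.default_rng(0)
W = rng.standard_normal((4096, 4096))
A, B = low_rank_factorize(W, rank=512)

x = rng.standard_normal(4096)
y_full = W @ x
y_lowrank = A @ (B @ x)          # two skinny matmuls instead of one dense one
print(np.linalg.norm(y_full - y_lowrank) / np.linalg.norm(y_full))  # relative error
```

A uniform rank (here 512 everywhere) is exactly the baseline the paper argues against; FLRC instead allocates a different rank to each layer to limit the accuracy loss at a given compression budget.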
- Anthology ID:
- 2025.emnlp-main.755
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 14956–14966
- Language:
- URL:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.emnlp-main.755/
- DOI:
- 10.18653/v1/2025.emnlp-main.755
- Cite (ACL):
- Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, and Kai-Chiang Wu. 2025. FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 14956–14966, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference (Lu et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/name-variant-enfa-fane/2025.emnlp-main.755.pdf