FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference

Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, Kai-Chiang Wu


Abstract
Although large language models (LLMs) have achieved remarkable performance, their enormous parameter counts hinder deployment on resource-constrained hardware. Low-rank compression can reduce both memory usage and computational demand, but applying a uniform compression ratio across all layers often leads to significant performance degradation, and previous methods perform poorly during decoding. To address these issues, we propose the Fine-grained Low-Rank Compressor (FLRC), which efficiently determines an optimal rank allocation for each layer, and incorporates progressive low-rank decoding to maintain text generation quality. Comprehensive experiments on diverse benchmarks demonstrate the superiority of FLRC, achieving up to a 17% improvement in ROUGE-L on summarization tasks compared to state-of-the-art low-rank compression methods, establishing a more robust and efficient framework to improve LLM inference.
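The parameter arithmetic behind low-rank compression can be sketched as follows. This is only a rough illustration of why a rank must be chosen per layer; it is not the FLRC rank-allocation procedure, which the abstract does not detail. It assumes an m x n weight matrix is replaced by two factors of shapes m x r and r x n, and the layer names, shapes, and keep ratio are hypothetical.

```python
def low_rank_params(m: int, n: int, r: int) -> int:
    """Parameters stored when an m x n matrix is replaced by (m x r) @ (r x n)."""
    return r * (m + n)


def rank_for_keep_ratio(m: int, n: int, keep_ratio: float) -> int:
    """Largest rank r whose two factors hold at most keep_ratio * m * n parameters."""
    return int(keep_ratio * m * n) // (m + n)


def break_even_rank(m: int, n: int) -> float:
    """Rank at which the factored form stores as many parameters as the dense matrix."""
    return (m * n) / (m + n)


# Hypothetical layer shapes, chosen only for illustration (assumption).
layers = {"attn_proj": (4096, 4096), "mlp_up": (4096, 11008)}

# Uniform allocation: every layer keeps the same fraction of its parameters.
# The abstract argues that this uniform choice often degrades performance.
uniform_ranks = {
    name: rank_for_keep_ratio(m, n, keep_ratio=0.8)
    for name, (m, n) in layers.items()
}

for name, (m, n) in layers.items():
    r = uniform_ranks[name]
    print(name, "rank:", r, "params:", low_rank_params(m, n, r), "dense:", m * n)
```

Under the same keep ratio, layers of different shapes end up with different ranks, and any rank above the break-even value saves no parameters at all. A per-layer allocation, as opposed to the uniform one shown here, would trade rank between layers under a shared budget.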
Anthology ID:
2025.emnlp-main.755
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
14956–14966
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.emnlp-main.755/
DOI:
10.18653/v1/2025.emnlp-main.755
Cite (ACL):
Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, and Kai-Chiang Wu. 2025. FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 14956–14966, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference (Lu et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.emnlp-main.755.pdf
Checklist:
 2025.emnlp-main.755.checklist.pdf