Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study

Peiyu Liu; Zikang Liu; Ze-Feng Gao; Dawei Gao; Wayne Xin Zhao; Yaliang Li; Bolin Ding; Ji-Rong Wen

Abstract

Despite their superior performance, Large Language Models (LLMs) require significant computational resources for deployment and use. To overcome this issue, quantization methods have been widely applied to reduce the memory footprint of LLMs and to increase inference speed. However, a major challenge is that low-bit quantization methods often lead to performance degradation, so it is important to understand how quantization impacts the capacity of LLMs. Unlike previous studies that focus on overall performance, this work investigates the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models. Specifically, we examine the abilities of in-context learning, chain-of-thought reasoning, and instruction-following in quantized LLMs. Our empirical experiments show that these emergent abilities still exist in 4-bit quantized models, while 2-bit models suffer severe performance degradation on tests of these abilities. To improve the performance of low-bit models, we conduct two special experiments: (1) a fine-grained impact analysis that studies which components (or substructures) are more sensitive to quantization, and (2) performance compensation through model fine-tuning. Our work derives a series of important findings for understanding the impact of quantization on emergent abilities and sheds light on the possibilities of extremely low-bit quantization for LLMs.
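As a rough illustration of why lower bit widths are harder, the sketch below applies plain round-to-nearest uniform quantization to a synthetic weight vector and reports the reconstruction error at 8, 4, and 2 bits. This is a generic textbook scheme chosen for illustration, not necessarily the quantization method used in the paper; the function names, the Gaussian weights, and the per-tensor min/max scaling are all assumptions.

```python
import random


def quantize_dequantize(weights, bits):
    """Round-to-nearest uniform quantization followed by dequantization.

    Assumption: a single (per-tensor) min/max range; real LLM quantizers
    typically use per-channel or per-group ranges and smarter rounding.
    """
    levels = 2 ** bits - 1                      # highest integer code
    w_min, w_max = min(weights), max(weights)
    # Guard against a constant tensor, where the range would be zero.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    restored = []
    for w in weights:
        code = round((w - w_min) / scale)       # integer code
        code = max(0, min(levels, code))        # clamp to [0, levels]
        restored.append(code * scale + w_min)   # map back to a float
    return restored


def mean_abs_error(original, restored):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


random.seed(0)
# Hypothetical stand-in for one weight tensor of a model.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {
    bits: mean_abs_error(weights, quantize_dequantize(weights, bits))
    for bits in (8, 4, 2)
}
for bits, err in errors.items():
    print(f"{bits}-bit mean absolute error: {err:.4f}")
```

With only four representable values at 2 bits, the reconstruction error of this naive scheme grows sharply compared with 4 bits. That is consistent in spirit with the degradation the abstract reports, though the sketch measures weight error only and says nothing about task accuracy.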

Anthology ID: 2024.lrec-main.461
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 5174–5190
URL: https://aclanthology.org/2024.lrec-main.461
Cite (ACL): Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao, Yaliang Li, Bolin Ding, and Ji-Rong Wen. 2024. Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 5174–5190, Torino, Italia. ELRA and ICCL.
Cite (Informal): Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study (Liu et al., LREC-COLING 2024)
PDF: https://preview.aclanthology.org/add_acl24_videos/2024.lrec-main.461.pdf