Abstract
Pruning techniques are essential for improving the efficiency of Large Language Models (LLMs): by reducing model size and computational demands, they enable faster and more cost-effective inference. Our key contribution in this work is the observation that LLMs trained on diverse languages exhibit distinct language-specific weight distributions. Exploiting this insight, we show that pruning LLMs using language-specific data yields more effective model compression. Empirical evidence underscores the importance of pruning on language-specific data: it has a noteworthy impact on the perplexity of Ukrainian texts compared to pruning on English data. The proposed methodology significantly reduces the size of LLaMA, LLaMA 2, and Mistral models while preserving competitive performance. This research underscores the significance of linguistic considerations in LLM pruning and advocates for language-specific optimization, establishing a framework for more efficient and tailored language models across diverse linguistic contexts. All experiments were conducted on a single consumer-grade NVIDIA RTX 3090 GPU, and the code is available at https://github.com/mshamrai/language-specific-pruning.
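The paper's released code is linked above; as a rough illustration of the idea, the sketch below applies Wanda-style unstructured pruning (weight magnitude scaled by calibration-input activation norms) driven by a language-specific calibration set. This is a minimal sketch, not the paper's exact implementation: the model checkpoint, the two Ukrainian calibration sentences, the 50% sparsity ratio, and the Wanda-style scoring rule are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; any causal LM works for the sketch
SPARSITY = 0.5                           # assumed unstructured sparsity ratio

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=dtype).to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model.eval()

# Language-specific calibration data: two Ukrainian sentences stand in for a real
# calibration corpus sampled from Ukrainian text.
calibration_texts = [
    "Київ є столицею України.",
    "Мова відіграє ключову роль у культурі.",
]

# Accumulate per-column squared L2 norms of the inputs to every nn.Linear layer.
input_sq_norms = {}

def make_hook(name):
    def hook(module, inputs, output):
        x = inputs[0].detach().float()
        sq = x.reshape(-1, x.shape[-1]).pow(2).sum(dim=0)
        input_sq_norms[name] = input_sq_norms.get(name, 0) + sq
    return hook

handles = [
    m.register_forward_hook(make_hook(n))
    for n, m in model.named_modules()
    if isinstance(m, torch.nn.Linear)
]

# Run the calibration texts through the model to collect activation statistics.
with torch.no_grad():
    for text in calibration_texts:
        ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
        model(ids)

for h in handles:
    h.remove()

# Wanda-style score: |weight| * L2 norm of the corresponding input feature.
# The lowest-scoring weights in each output row are zeroed to reach the target sparsity.
with torch.no_grad():
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear) and name in input_sq_norms:
            w = module.weight.data
            scores = w.abs().float() * input_sq_norms[name].sqrt().to(w.device)
            k = int(w.shape[1] * SPARSITY)
            if k == 0:
                continue
            _, idx = torch.topk(scores, k, dim=1, largest=False)
            w.scatter_(1, idx, 0)
```

Swapping the calibration sentences for English text reproduces the comparison described in the abstract; perplexity on a held-out Ukrainian corpus can then be measured for each pruned model.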
- Anthology ID: 2024.unlp-1.16
- Volume: Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024
- Month: May
- Year: 2024
- Address: Torino, Italia
- Editors: Mariana Romanyshyn, Nataliia Romanyshyn, Andrii Hlybovets, Oleksii Ignatenko
- Venue: UNLP
- Publisher: ELRA and ICCL
- Pages: 135–140
- URL: https://aclanthology.org/2024.unlp-1.16
- Cite (ACL): Maksym Shamrai. 2024. Language-Specific Pruning for Efficient Reduction of Large Language Models. In Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024, pages 135–140, Torino, Italia. ELRA and ICCL.
- Cite (Informal): Language-Specific Pruning for Efficient Reduction of Large Language Models (Shamrai, UNLP 2024)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.unlp-1.16.pdf