Logits-Based Block Pruning with Affine Transformations for Large Language Models

Zekun Hu, Yichu Xu, De-Chuan Zhan


Abstract
As Large Language Models (LLMs) continue to grow in scale, the cost of training and inference has risen sharply, limiting their use in resource-constrained settings. Model pruning is widely used to reduce this computational burden, and block-wise pruning in particular has gained popularity because removing entire blocks of parameters directly accelerates computation. However, existing methods often rely on hard labels from calibration datasets and neglect the cumulative effect of pruning on subsequent blocks. To address this, we propose two complementary techniques. The Logit Disruption Score (LDS) is a block importance criterion that measures the impact of pruning via the cosine similarity between the logits of the original and pruned models, focusing on the most informative logit dimensions to better preserve the model’s core capabilities. Activation Statistics Correction (ASC) is an affine transformation mechanism that aligns the mean and variance of the pruned model’s activations with those of the original model, mitigating the distribution shift caused by block removal and improving information flow through subsequent blocks. Experiments across multiple datasets show that our approach reduces reliance on calibration data and improves generalization, achieving results competitive with existing methods.
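The two ideas in the abstract can be sketched in a few lines. This is only an illustration of the concepts as stated there, not the paper's implementation: the function names, the top-k selection rule, and the per-tensor statistics are our own assumptions.

```python
import numpy as np

def logit_disruption_score(orig_logits, pruned_logits, k=10):
    """Sketch of an LDS-style criterion: cosine similarity between the
    original and pruned models' logits, restricted to the k dimensions
    where the original logits are largest (assumed here to be the
    'most informative' ones). Higher score = more disruption."""
    idx = np.argsort(orig_logits)[-k:]          # top-k dims of the original
    a, b = orig_logits[idx], pruned_logits[idx]
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return 1.0 - cos                            # 0 when logits are unchanged

def activation_statistics_correction(pruned_act, orig_mean, orig_std):
    """Sketch of an ASC-style affine correction: standardize the pruned
    model's activations, then rescale/shift them to match the original
    model's mean and standard deviation."""
    mu, sigma = pruned_act.mean(), pruned_act.std()
    return (pruned_act - mu) / (sigma + 1e-12) * orig_std + orig_mean
```

Under this reading, blocks whose removal yields a low disruption score are pruned first, and ASC is applied to the activations feeding the remaining blocks so their statistics match the unpruned model.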
Anthology ID:
2026.findings-eacl.193
Volume:
Findings of the Association for Computational Linguistics: EACL 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Màrquez
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3722–3736
URL:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.193/
Cite (ACL):
Zekun Hu, Yichu Xu, and De-Chuan Zhan. 2026. Logits-Based Block Pruning with Affine Transformations for Large Language Models. In Findings of the Association for Computational Linguistics: EACL 2026, pages 3722–3736, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Logits-Based Block Pruning with Affine Transformations for Large Language Models (Hu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.193.pdf
Checklist:
 2026.findings-eacl.193.checklist.pdf