Zunhai Su
2026
Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models
He Xiao | Qingyao Yang | Dirui Xie | Wendong XU | Zunhai Su | Runming Yang | Haobo Liu | Wenyong Zhou | Zhengwu Liu | Ngai Wong
Findings of the Association for Computational Linguistics: ACL 2026
Large language models with billions of parameters are often over-provisioned: many layers contribute little unique information yet dominate the memory and energy footprint during inference. We present LieQ (Layer-wise information effectiveness Quantization), a hardware-native, metric-driven post-training quantization framework that addresses the critical challenge of maintaining accuracy in sub-8B models (those with fewer than 8B parameters) under extreme low-bit compression. LieQ keeps a uniform bit-width within each layer while mixing precision across layers, preserving standard multiplication kernels and avoiding irregular memory access, codebooks, and non-standard formats at inference time. Our method uncovers a strong correlation between layer-wise functional saliency and representational compactness, revealing that layers with higher training-induced energy concentration are functionally irreplaceable. Leveraging this insight, we propose a purely geometry-driven sensitivity proxy that enables automatic bit-width allocation under a target average-bit budget without expensive gradient updates or inference-based perplexity probing. At an average weight bit-width approaching two bits per parameter, LieQ consistently reduces the large accuracy gap typically observed for naive uniform 2-bit baselines on the Qwen3 and LLaMA3.x families, while retaining standard-kernel efficiency. These properties make LieQ a practical path toward deploying small language models on resource-constrained edge devices. Code will be available at: https://github.com/HeXiao-55/LieQ-official.git.
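The abstract does not spell out LieQ's allocation rule, so the following is only a minimal sketch of one generic way to assign per-layer bit-widths under an average-bit budget. It assumes each layer already has a sensitivity score (a stand-in for the geometry-driven proxy, which is not reproduced here) and a parameter count. The function names `allocate_bits` and `average_bits`, the greedy rule, and the 2-bit/4-bit choice are all illustrative assumptions, not the authors' implementation.

```python
from typing import List


def average_bits(bits: List[int], sizes: List[int]) -> float:
    """Parameter-weighted average bit-width across layers."""
    return sum(b * s for b, s in zip(bits, sizes)) / sum(sizes)


def allocate_bits(scores: List[float], sizes: List[int], avg_budget: float,
                  low: int = 2, high: int = 4) -> List[int]:
    """Hypothetical greedy allocation (an assumption, not LieQ's rule).

    Every layer starts at `low` bits. Layers are then visited from the
    highest to the lowest sensitivity score, and a layer is promoted to
    `high` bits only if the parameter-weighted average stays within
    `avg_budget`. Each layer keeps a single uniform bit-width.
    """
    if len(scores) != len(sizes):
        raise ValueError("scores and sizes must have the same length")
    if avg_budget < low:
        raise ValueError("budget is below the minimum bit-width")

    total_params = sum(sizes)
    limit = avg_budget * total_params   # total bits allowed
    used = low * total_params           # bits spent with every layer at `low`
    bits = [low] * len(scores)

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for i in order:
        extra = (high - low) * sizes[i]  # cost of promoting layer i
        if used + extra <= limit:
            bits[i] = high
            used += extra
    return bits


# Toy inputs: four equally sized layers with made-up sensitivity scores.
toy_scores = [0.9, 0.1, 0.5, 0.2]
toy_sizes = [100, 100, 100, 100]
toy_bits = allocate_bits(toy_scores, toy_sizes, avg_budget=2.5)
print(toy_bits, average_bits(toy_bits, toy_sizes))
```

With these toy inputs only the most sensitive layer fits within the budget, so it alone is promoted to 4 bits while the others stay at 2 bits.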
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
Hengyuan Zhang | Zhihao Zhang | Ercong Nie | Mingyang Wang | Zunhai Su | Yiwei Wang | Qianli Wang | Shuzhou Yuan | Xufeng Duan | Qibo Xue | Zeping Yu | Chenming Shang | Xiao Liang | Jing Xiong | Hui Shen | Chaofan Tao | Zhengwu Liu | Senjie Jin | Zhiheng Xi | Dongdong Zhang | Sophia Ananiadou | Tao Gui | Ruobing Xie | Hayden Kwok-Hay So | Hinrich Schuetze | Xuanjing Huang | Qi Zhang | Ngai Wong
Findings of the Association for Computational Linguistics: ACL 2026
Mechanistic Interpretability (MI) has emerged as a vital approach to demystify the opaque decision-making of Large Language Models (LLMs). However, existing reviews primarily treat MI as an observational science, summarizing analytical insights while lacking a systematic framework for actionable intervention. To bridge this gap, we present a practical survey structured around the pipeline: "Locate, Steer, and Improve." We formally categorize Localizing (diagnosis) and Steering (intervention) methods based on specific Interpretable Objects to establish a rigorous intervention protocol. Furthermore, we demonstrate how this framework enables tangible improvements in Alignment, Capability, and Efficiency, effectively operationalizing MI as a practical engineering toolkit for model optimization. The curated paper list of this work is available at https://anonymous.4open.science/r/Act-MI-F068.
Co-authors
- Zhengwu Liu 2
- Ngai Wong 2
- Sophia Ananiadou 1
- Xufeng Duan 1
- Tao Gui 1
- Xuan-Jing Huang (黄萱菁) 1
- Senjie Jin 1
- Xiao Liang (梁霄) 1
- Haobo Liu 1
- Ercong Nie 1
- Hinrich Schuetze 1
- Chenming Shang 1
- Hui Shen 1
- Hayden Kwok-Hay So 1
- Chaofan Tao 1
- Mingyang Wang 1
- Yiwei Wang 1
- Qianli Wang 1
- Wendong XU 1
- Zhiheng Xi 1
- He Xiao 1
- Dirui Xie 1
- Ruobing Xie 1
- Jing Xiong 1
- Qibo Xue 1
- Qingyao Yang 1
- Runming Yang 1
- Zeping Yu 1
- Shuzhou Yuan 1
- Hengyuan Zhang 1
- Zhihao Zhang 1
- Dongdong Zhang 1
- Qi Zhang 1
- Wenyong Zhou 1