Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection

Muhammad Alif Al Hakim, Alfan Farizki Wicaksono, Fajri Koto


Abstract
Quantization is widely adopted to reduce the computational cost of large language models (LLMs); however, its implications for fairness and safety, particularly in dynamic quantization and multilingual contexts, remain underexplored. In this work, we conduct a systematic study of how static and dynamic quantization methods impact fairness and safety across benchmarks measuring intrinsic and extrinsic bias and safety alignment. For fairness, we evaluate English, French, Dutch, Spanish, and Turkish; for safety, we focus on English, Korean, and Arabic. Our findings reveal that quantization consistently degrades fairness and safety, with dynamic methods demonstrating greater stability than static ones. Moreover, fairness degradation varies across languages, while safety deterioration is especially pronounced in non-English settings. To address these risks, we introduce Critical Weight Protection, a novel technique that identifies and preserves fairness- and safety-critical weights during quantization. This approach mitigates bias and safety deterioration without costly retraining or alignment, maintaining trustworthiness while retaining efficiency.
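The abstract does not say how Critical Weight Protection scores weights or at what precision the protected weights are stored, so the following is only a minimal sketch of the general idea of quantizing most of a weight tensor while leaving a small set of high-importance entries untouched. The function names, the placeholder importance scores, the protected fraction, and the 4-bit round-to-nearest quantizer are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def quantize_dequantize(w, num_bits=4):
    """Symmetric per-tensor round-to-nearest quantization, returned as floats.

    A generic quantizer used as a stand-in; the paper's quantization
    methods are not described in the abstract.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

def protect_critical_weights(w, scores, protect_fraction=0.01, num_bits=4):
    """Quantize w but keep the highest-scoring entries at original precision.

    `scores` is a hypothetical per-weight importance array with the same
    shape as `w`; how such scores would be computed is an assumption here.
    Returns the mixed-precision tensor and the boolean protection mask.
    """
    k = max(1, int(round(protect_fraction * w.size)))
    # Indices of the k largest scores in the flattened tensor.
    top_idx = np.argpartition(scores.ravel(), -k)[-k:]
    mask = np.zeros(w.size, dtype=bool)
    mask[top_idx] = True
    mask = mask.reshape(w.shape)
    w_q = quantize_dequantize(w, num_bits)
    # Protected entries keep their original values; the rest are quantized.
    return np.where(mask, w, w_q), mask

# Toy usage with random weights and placeholder importance scores.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)).astype(np.float32)
scores = np.abs(w * rng.normal(size=w.shape))  # placeholder, not a real saliency measure
w_mixed, mask = protect_critical_weights(w, scores, protect_fraction=0.1)
```

In this toy setup the protected entries pass through unchanged and every other entry is snapped to the 4-bit grid. A real system would also have to store the protected entries separately (for example in a sparse higher-precision format), which the sketch does not attempt.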
Anthology ID:
2026.findings-acl.993
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
19831–19855
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.993/
Cite (ACL):
Muhammad Alif Al Hakim, Alfan Farizki Wicaksono, and Fajri Koto. 2026. Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection. In Findings of the Association for Computational Linguistics: ACL 2026, pages 19831–19855, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection (Al Hakim et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.993.pdf
Checklist:
 2026.findings-acl.993.checklist.pdf