Lost in Quantization: Activation Outliers Explain Language-Specific FP8 Sensitivity in Llama-3
Guilherme Silva, Pedro Silva, Matheus Peixoto, Gladston Moreira, Eduardo Luz
Abstract
Quantization is key for efficient LLM inference, but its language-specific effects are understudied. We compare INT8 and FP8 (E4M3) quantization for Meta-Llama-3-8B on English and Brazilian Portuguese (PT-BR). INT8 with outlier handling preserves perplexity in both languages, while naive FP8 casting degrades English far more than PT-BR (+18% vs. +3.9%). Activation analysis shows rarer, larger English spikes (>35) that are more prone to saturation under unscaled E4M3, whereas PT-BR activations are more concentrated. Our FP8 results reflect a naive casting stress test (no calibration/scaling), not an optimized FP8 recipe.

- Anthology ID: 2026.propor-1.108
- Volume: Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month: April
- Year: 2026
- Address: Salvador, Brazil
- Editors: Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue: PROPOR
- Publisher: Association for Computational Linguistics
- Pages: 1044–1048
- URL: https://preview.aclanthology.org/ingest-dnd/2026.propor-1.108/
- Cite (ACL): Guilherme Silva, Pedro Silva, Matheus Peixoto, Gladston Moreira, and Eduardo Luz. 2026. Lost in Quantization: Activation Outliers Explain Language-Specific FP8 Sensitivity in Llama-3. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 1044–1048, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal): Lost in Quantization: Activation Outliers Explain Language-Specific FP8 Sensitivity in Llama-3 (Silva et al., PROPOR 2026)
- PDF: https://preview.aclanthology.org/ingest-dnd/2026.propor-1.108.pdf
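The mechanism the abstract describes, large activation outliers losing precision or saturating when cast to E4M3 without scaling, can be sketched with a minimal round-to-nearest e4m3fn quantizer. This is a simplified illustrative model, not the paper's code, and the example activation values are hypothetical:

```python
import math

E4M3_MAX = 448.0  # largest finite e4m3fn value: 1.75 * 2^8

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest e4m3fn-representable value (normals only,
    tiny values folded into the smallest exponent bin for simplicity)."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = abs(x)
    if mag >= E4M3_MAX:
        return sign * E4M3_MAX          # saturate at the format's max
    exp = math.floor(math.log2(mag))
    exp = max(min(exp, 8), -6)          # e4m3fn normal exponent range
    step = 2.0 ** (exp - 3)             # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(mag / step) * step, E4M3_MAX)

# A moderate activation quantizes on a fine grid (step 0.25 here) ...
print(quantize_e4m3(3.7))    # -> 3.75
# ... while an outlier in the >35 spike regime the paper reports for
# English lands on a grid with step 4.0, and anything past 448 saturates.
print(quantize_e4m3(35.0))   # -> 36.0
print(quantize_e4m3(500.0))  # -> 448.0
```

The widening grid illustrates why rare, large spikes suffer disproportionately under unscaled casting: absolute rounding error grows with magnitude, and per-tensor scaling (absent in the naive stress test) is what normally keeps outliers inside the representable range.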