Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding
Taowen Liu, Marta Andronic, Deniz Gunduz, George Anthony Constantinides
Abstract
LLM training is resource-intensive. Quantized training improves computational and memory efficiency but introduces quantization noise, which can hinder convergence and degrade model accuracy. Stochastic Rounding (SR) has emerged as a theoretically attractive alternative to deterministic rounding, offering unbiased gradient estimates. However, its interaction with other training factors—especially batch size—remains underexplored. In this paper, we present a theoretical and empirical study of mini-batch stochastic gradient descent (SGD) with SR, showing that increased batch sizes can compensate for reduced precision during backpropagation. Furthermore, we show that quantizing weights and activations impacts gradient variance in distinct ways. Our experiments validate these theoretical insights. Our experiments validate these theoretical insights.- Anthology ID:
- Anthology ID: 2025.findings-emnlp.784
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 14531–14546
- URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.784/
- DOI: 10.18653/v1/2025.findings-emnlp.784
- Cite (ACL): Taowen Liu, Marta Andronic, Deniz Gunduz, and George Anthony Constantinides. 2025. Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14531–14546, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Training with Fewer Bits: Unlocking Edge LLMs Training with Stochastic Rounding (Liu et al., Findings 2025)
- PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.784.pdf