Fusion Training for Mathematical Generalization in Large Language Models

Congfeng Cao, Pengyu Zhang, Jelke Bloem


Abstract
Thinking Mode Fusion (TMF) enables large language models to support both concise responses and long-form reasoning by unifying a non-thinking mode and a thinking mode within a single model. However, its training dynamics remain underexplored. In this work, we present a systematic study of how the training schedule and the data ratio between thinking and non-thinking modes affect TMF. Focusing on mathematical problem solving, we construct a benchmark with multiple thinking-to-non-thinking data ratios and three training schedules. Our results reveal an asymmetric interaction between the two modes: increasing the ratio of non-thinking supervision reduces the accuracy of the thinking mode. We further show that different training schedules modulate this trade-off and that the optimal schedule depends on the data ratio. Finally, we quantify a negative correlation between non-thinking and thinking mode supervision, highlighting an inherent tension between the two modes. These findings provide practical guidance for designing effective TMF training settings. All code and data are released at Fusion Bench to support further research.
Anthology ID:
2026.acl-srw.64
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
712–724
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.64/
Cite (ACL):
Congfeng Cao, Pengyu Zhang, and Jelke Bloem. 2026. Fusion Training for Mathematical Generalization in Large Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 712–724, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Fusion Training for Mathematical Generalization in Large Language Models (Cao et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.64.pdf