SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

Yuxin Xiao, Shujian Zhang, Marzyeh Ghassemi, Wenxuan Zhou


Abstract
To acquire instruction-following capabilities, large language models (LLMs) undergo instruction tuning, where they are trained on instruction-response pairs using next-token prediction (NTP). Efforts to improve instruction tuning often focus on higher-quality supervised fine-tuning (SFT) datasets, typically requiring data filtering with proprietary LLMs or human annotation. In this paper, we take a different approach by proposing SFTMix, a novel Mixup-based recipe that elevates LLM instruction tuning without relying on well-curated datasets. We observe that LLMs exhibit uneven confidence across the semantic representation space. We argue that examples with different confidence levels should play distinct roles in instruction tuning: Confident data is prone to overfitting, while unconfident data is harder to generalize. Based on this insight, SFTMix leverages training dynamics to identify examples with varying confidence levels. We then interpolate them to bridge the confidence gap and apply a Mixup-based regularization to support learning on these additional, interpolated examples. We demonstrate the effectiveness of SFTMix in both instruction-following and healthcare-specific SFT tasks, with consistent improvements across LLM families and SFT datasets of varying sizes and qualities. Extensive analyses across six directions highlight SFTMix’s compatibility with data selection, adaptability to compute-constrained scenarios, and scalability to broader applications.
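The abstract describes three steps: score examples by confidence from training dynamics, interpolate confident with unconfident examples, and regularize with a Mixup-style loss. The sketch below illustrates only the generic Mixup arithmetic behind those steps and is not the authors' implementation. Everything in it is an assumption for illustration: the function names, the median split, the use of per-checkpoint confidence scores, and the Beta(0.5, 0.5) mixing distribution are not specified in the abstract.

```python
import random
from statistics import mean


def split_by_confidence(confidences):
    """Split example ids into (confident, unconfident) halves.

    `confidences` maps an example id to a list of per-checkpoint confidence
    scores (hypothetical: e.g. mean probability assigned to the gold tokens).
    The median split is an assumption, not taken from the paper.
    """
    avg = {ex_id: mean(scores) for ex_id, scores in confidences.items()}
    ranked = sorted(avg, key=avg.get, reverse=True)  # most confident first
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


def sample_lambda(alpha=0.5, rng=random):
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)


def mixup(vec_a, vec_b, lam):
    """Classic Mixup: convex combination of two equal-length vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(vec_a, vec_b)]


def mixup_loss(loss_a, loss_b, lam):
    """Weight the two examples' losses by the same mixing coefficient."""
    return lam * loss_a + (1.0 - lam) * loss_b


if __name__ == "__main__":
    scores = {"a": [0.9, 0.8], "b": [0.2, 0.3], "c": [0.6, 0.7], "d": [0.1, 0.1]}
    confident, unconfident = split_by_confidence(scores)
    lam = sample_lambda()
    # Stand-in 2-d representations; a real model would supply hidden states.
    mixed = mixup([1.0, 0.0], [0.0, 1.0], lam)
    print(confident, unconfident, lam, mixed)
```

In a real training loop, the vectors passed to `mixup` would be model representations of a confident and an unconfident example, and the value from `mixup_loss` would be added to the usual next-token prediction loss as a regularizer. How that combination is weighted is not stated in the abstract.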
Anthology ID:
2026.acl-long.78
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1705–1718
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.78/
Cite (ACL):
Yuxin Xiao, Shujian Zhang, Marzyeh Ghassemi, and Wenxuan Zhou. 2026. SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1705–1718, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe (Xiao et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.78.pdf
Checklist:
2026.acl-long.78.checklist.pdf