Empirical Analysis of Task Mixture Effects in Small-scale Instruction Tuning: A Statistical Approach

Jeesu Jung; Sangkeun Jung

Abstract

The performance of large language models depends heavily on instruction tuning, and in particular on the task types and mixture ratios used. However, previous research has primarily mixed tasks at fixed ratios and lacks a systematic, quantitative analysis of task-wise interactions across diverse tasks. It has also relied heavily on human labeling. To address these limitations, this study conducts empirical experiments on unlabeled instruction corpora, varying both the number and the proportion of tasks in each combination to identify effective mixtures. To minimize manual labeling, we automatically extract five representative tasks (programming, math problem solving, history question answering, grammar correction, and creative writing) using only a few seed instructions. Across 51 mixtures, we find that 1–2 task mixtures work best with small datasets, while synergistic 3-task mixtures excel with larger data. Task interactions reveal both synergy (e.g., programming + math) and interference (e.g., programming + creative writing). These results provide practical guidelines for mixture design tailored to model scale and data size.

Anthology ID: 2026.findings-acl.643
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13168–13186
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.643/
Cite (ACL): Jeesu Jung and Sangkeun Jung. 2026. Empirical Analysis of Task Mixture Effects in Small-scale Instruction Tuning: A Statistical Approach. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13168–13186, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Empirical Analysis of Task Mixture Effects in Small-scale Instruction Tuning: A Statistical Approach (Jung & Jung, Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.643.pdf
Checklist: 2026.findings-acl.643.checklist.pdf
