Compatibility-Aware Dynamic Fine-Tuning for Large Language Models

Yucheng Zhou, Junwei Sheng, Qianning Wang, Jianbing Shen


Abstract
Supervised Fine-Tuning (SFT) is the predominant paradigm for aligning large language models (LLMs), yet it suffers from optimization instability and limited generalization. Recent work attributes these issues to pathological gradient scaling and proposes Dynamic Fine-Tuning (DFT) to correct it at the token level. However, DFT assumes all demonstrations are equally suitable learning targets, an assumption violated by the strong heterogeneity of large-scale instruction data, where demonstration-policy mismatch induces high-variance updates at the sample level. We introduce Compatibility-Aware Dynamic Fine-Tuning (CADFT), a principled extension of DFT that controls sample-level optimization variance. CADFT derives a dynamic, policy-dependent compatibility signal from model likelihoods to modulate supervised updates, suppressing high-variance gradients from incompatible demonstrations. We further propose a delayed, low-frequency compatibility-guided rewriting strategy to transform persistently incompatible demonstrations into learnable targets. We show that CADFT can be interpreted as a variance-controlled estimator that generalizes token-level stabilization in DFT to the sample level. Extensive experiments demonstrate improved stability, generalization, and cold-start reinforcement learning initialization, while remaining fully supervised and free of reward modeling.
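
To make the idea of likelihood-based modulation concrete, the following is a minimal sketch and not the authors' formulation. The abstract does not give the compatibility signal's exact form, so the geometric-mean token probability used here is an assumption, as are the function names `dft_token_weights`, `compatibility_weight`, and `sample_loss`. The token-level weighting follows the common description of DFT as rescaling each token's negative log-likelihood by its own probability. In an autograd framework both weights would be treated as constants (detached from the gradient); this pure-Python version only computes loss values.

```python
import math

def dft_token_weights(token_logprobs):
    """Token-level weights p_t = exp(log p_t), treated as constants (no gradient)."""
    return [math.exp(lp) for lp in token_logprobs]

def compatibility_weight(token_logprobs):
    """Hypothetical sample-level signal: geometric-mean token probability, in (0, 1]."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def sample_loss(token_logprobs, use_compatibility=True):
    """Probability-weighted negative log-likelihood, optionally scaled per sample."""
    weights = dft_token_weights(token_logprobs)
    token_loss = sum(w * -lp for w, lp in zip(weights, token_logprobs)) / len(token_logprobs)
    # Assumption: the sample-level signal multiplies the whole per-sample loss.
    scale = compatibility_weight(token_logprobs) if use_compatibility else 1.0
    return scale * token_loss

# Log-probabilities the current policy assigns to two illustrative demonstrations.
compatible = [math.log(0.9), math.log(0.8), math.log(0.85)]
incompatible = [math.log(0.05), math.log(0.02), math.log(0.1)]

for name, lps in [("compatible", compatible), ("incompatible", incompatible)]:
    print(name, round(compatibility_weight(lps), 4), round(sample_loss(lps), 4))
```

Under this assumed signal, a demonstration the policy finds unlikely receives a small sample-level weight, so its contribution to the update shrinks relative to the token-level weighting alone.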
Anthology ID: 2026.acl-long.1383
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 29997–30008
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1383/
Cite (ACL): Yucheng Zhou, Junwei Sheng, Qianning Wang, and Jianbing Shen. 2026. Compatibility-Aware Dynamic Fine-Tuning for Large Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 29997–30008, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Compatibility-Aware Dynamic Fine-Tuning for Large Language Models (Zhou et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1383.pdf
Checklist: 2026.acl-long.1383.checklist.pdf