Ziqiao Wang
2025
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL
Wei Yao
|
Wenkai Yang
|
Ziqiao Wang
|
Yankai Lin
|
Yong Liu
Findings of the Association for Computational Linguistics: ACL 2025
As large language models advance toward superhuman performance, ensuring their alignment with human values and abilities grows increasingly complex. Weak-to-strong generalization offers a promising approach by leveraging predictions from weaker models to guide stronger systems, but its effectiveness could be constrained by the inherent noise and inaccuracies in these weak predictions. To address this, we propose a theoretically grounded approach that replaces forward KL divergence—whose mass-covering behavior risks overfitting to imperfect weak signals—with reverse KL divergence. Reverse KL divergence’s zero-forcing effect prioritizes high-confidence predictions, effectively mitigating the influence of unreliable weak supervision. Theoretically, we extend existing bounds and derive tighter lower bounds for both forward and reverse KL divergence. Notably, when a sufficiently pre-trained strong model is fine-tuned on the last linear layer, reverse KL guarantees that it outperforms its weak supervisor by the magnitude of their disagreement. Empirically, we demonstrate that reverse KL and reverse cross-entropy not only enable strong models to outperform those trained with forward KL and standard cross-entropy across most settings, but also exhibit greater robustness to noisy labels.
Dynamic Task Vector Grouping for Efficient Multi-Task Prompt Tuning
Peiyi Zhang
|
Richong Zhang
|
Zhijie Nie
|
Ziqiao Wang
Findings of the Association for Computational Linguistics: ACL 2025
Multi-task prompt tuning utilizes multiple high-resource source tasks to improve performance on low-source target tasks. Existing approaches transfer the soft prompt trained by combining all source tasks or a single “high-similar” source task one-time-only. However, we find that the optimal transfer performance often comes from a combination of source tasks, which is neither one nor all. Further, we find that the similarity between source and target tasks also changes dynamically during fine-tuning after transfering, making similarity calculation in the initiation stage inadequate. To address these issues, we propose a method called Dynamic Task Vector Grouping (DTVG), whose core ideas contain (1) measuring the task similarity with task vectors instead of soft prompt, (2) grouping the optimal source task combination based on two metrics: target similarity and knowledge consistency; (3) dynamically updating the combination in each iteration step. Extensive experiments on the 26 NLP datasets under different settings demonstrate that DTVG effectively groups similar source tasks while reducing negative transfer, achieving the start-of-art performance.