Zhanming Shen
2026
Fine-Grained Data Ordering Improves Fine-Tuning for Large Language Models
Xiaomeng Hu | Yixuan Tang | Haoze Li | Hao Chen | Qi Zhang | Zhanming Shen | Yiming Zhang | Haobo Wang | Junbo Zhao
Findings of the Association for Computational Linguistics: ACL 2026
With the rapid progress of large language models (LLMs), aligning a general-purpose model with downstream tasks through fine-tuning has become a central research focus. Selecting only high-quality examples for training has been shown to be one of the most effective ways to improve fine-tuning performance. However, prior work concentrates almost exclusively on data preprocessing, that is, filtering and cleaning data before training begins, while the order and composition of data during training have received little fine-grained attention. To fill this gap, we propose Fine-Grained Order Fine-Tuning (FOT), a method that schedules the order of training data at a fine granularity within each epoch. Drawing on curriculum-learning principles, FOT defines data difficulty based on the relevance between the data and the model, and then dynamically schedules the training order in each epoch according to that difficulty. In both large-scale continued pre-training and small-scale supervised fine-tuning experiments, FOT achieves an average improvement of 2.4% over baselines. Our study offers a new perspective on data governance in the fine-tuning phase.
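The per-epoch reordering idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how difficulty is computed or whether data is ordered easiest-first, so the sketch assumes a caller-supplied scoring function (for example, per-example loss under the current model) and ascending order. The names `order_epoch`, `train_with_reordering`, `difficulty_for_epoch`, and `train_step` are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def order_epoch(examples: Sequence[T], difficulty: Callable[[T], float]) -> List[T]:
    """Return the examples sorted from lowest to highest difficulty score.

    Assumption: easiest-first ordering; the abstract does not state the direction.
    """
    return sorted(examples, key=difficulty)


def train_with_reordering(
    examples: Sequence[T],
    difficulty_for_epoch: Callable[[int], Callable[[T], float]],
    train_step: Callable[[T], None],
    num_epochs: int,
) -> List[List[T]]:
    """Re-score and re-sort the data at the start of every epoch.

    difficulty_for_epoch(epoch) stands in for "score each example against the
    current model"; how that score is defined is an assumption of this sketch.
    Returns the order used in each epoch so the schedule can be inspected.
    """
    schedule: List[List[T]] = []
    for epoch in range(num_epochs):
        difficulty = difficulty_for_epoch(epoch)
        ordered = order_epoch(examples, difficulty)
        for example in ordered:
            train_step(example)
        schedule.append(ordered)
    return schedule
```

Because scores are recomputed each epoch, an example that was hard early on can move toward the front of the schedule once the model has adapted to it, which is what distinguishes dynamic scheduling from a single fixed curriculum.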
2025
CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency
Zhanming Shen | Hao Chen | Yulei Tang | Shaolin Zhu | Wentao Ye | Xiaomeng Hu | Haobo Wang | Gang Chen | Junbo Zhao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Instruction tuning is vital for aligning large language models (LLMs) with human intent, but current methods typically rely on costly human-annotated seed data or powerful external teacher models. While instruction back-translation techniques reduce this dependency, they remain fundamentally tethered to an initial seed set, which limits full automation, introduces biases, and can lead to inefficient use of unlabeled corpora. In this paper, we propose Cycle-Instruct, a novel framework that achieves fully seed-free instruction tuning. Inspired by cycle consistency, Cycle-Instruct employs a dual self-training loop where two models—an answer generator and a question generator—are bootstrapped solely from raw, unlabeled text. These models mutually supervise each other by reconstructing original text segments from their counterpart’s generated pseudo-labels, effectively learning from the intrinsic structure of the data without any human-provided seeds. We demonstrate Cycle-Instruct’s efficacy across four diverse data tracks, including general instruction-following, domain-specific tasks, dialogue logs, and plain text. Our extensive experiments show that Cycle-Instruct not only outperforms seed-driven back-translation baselines but also achieves performance comparable to strongly supervised methods.
pFedGPT: Hierarchically Optimizing LoRA Aggregation Weights for Personalized Federated GPT Models
Zhanming Shen | Tianqi Xu | Hao Wang | Jian Li | Miao Pan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Federated finetuning of Large Language Models (LLMs) using Low-Rank Adaptation (LoRA) offers computational efficiency and preserves data privacy. However, applying LoRA in federated settings faces significant challenges: standard approaches struggle with data heterogeneity, and existing personalization techniques fail to precisely adapt shared global knowledge to individual client needs. To address these issues, we propose pFedGPT, a framework that leverages Hierarchical Bayesian Optimization (HBO) for fine-grained, personalized LoRA aggregation. pFedGPT intelligently partitions LoRA parameters based on model structure and client information, then employs HBO to hierarchically search for optimal, module-specific weights. This enables a nuanced integration of the downloaded global LoRA state with each client’s local model, precisely capturing client-specific requirements. To manage the optimization cost inherent in HBO, pFedGPT incorporates efficient multi-fidelity evaluations and a curriculum learning strategy. Extensive experiments demonstrate that pFedGPT achieves state-of-the-art (SOTA) performance on personalized FL benchmarks, showcasing robustness and scalability while introducing only minimal (approx. 4%) additional optimization overhead. Our results also underscore the limitations of traditional FL methods for LoRA-based LLM personalization, highlighting the need for tailored approaches like pFedGPT.
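As a rough illustration of module-specific aggregation weights, the sketch below blends a downloaded global LoRA state into a client's local state, one module at a time. It rests on several assumptions not stated in the abstract: the blend is a simple linear interpolation, parameters are flat lists of floats, and the per-module weights are given rather than found by Hierarchical Bayesian Optimization (the search itself is not shown). The function name `merge_lora` and the module names are hypothetical.

```python
from typing import Dict, List

LoraState = Dict[str, List[float]]


def merge_lora(
    global_state: LoraState,
    local_state: LoraState,
    weights: Dict[str, float],
) -> LoraState:
    """Blend global LoRA parameters into the local ones, module by module.

    For each module: merged = w * global + (1 - w) * local, with w in [0, 1].
    Assumption: linear interpolation; modules without a weight keep the
    local parameters unchanged (w = 0).
    """
    merged: LoraState = {}
    for name, local_params in local_state.items():
        w = weights.get(name, 0.0)
        global_params = global_state[name]
        merged[name] = [
            w * g + (1.0 - w) * l for g, l in zip(global_params, local_params)
        ]
    return merged


# Usage: take half of the global update for "q_proj", all of it for "v_proj".
example = merge_lora(
    {"q_proj": [1.0, 1.0], "v_proj": [2.0]},
    {"q_proj": [0.0, 0.0], "v_proj": [0.0]},
    {"q_proj": 0.5, "v_proj": 1.0},
)
```

A separate weight per module lets a client accept global knowledge where it helps and retain local parameters where its data differs, which is the kind of fine-grained control the abstract attributes to the searched weights.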