Hayden Kwok-Hay So
2026
Find Your Optimal Teacher: Personalized Data Synthesis via Router-Guided Multi-Teacher Distillation
Hengyuan Zhang | Shiping Yang | Xiao Liang | Chenming Shang | Yuxuan Jiang | Chaofan Tao | Jing Xiong | Hayden Kwok-Hay So | Ruobing Xie | Angel X Chang | Ngai Wong
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Hengyuan Zhang | Shiping Yang | Xiao Liang | Chenming Shang | Yuxuan Jiang | Chaofan Tao | Jing Xiong | Hayden Kwok-Hay So | Ruobing Xie | Angel X Chang | Ngai Wong
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Training student models on synthetic data generated by strong teacher models is a promising approach to distilling the capabilities of teachers. However, existing studies reveal that stronger models are not always optimal teachers, suggesting a mismatch between the teacher’s output and the student’s learning ability. To address this issue, we propose PerSyn (Personalized data Synthesis), a novel and efficient approach that customizes synthetic data to align with the learning capabilities of the student model. Specifically, our PerSyn method routes each prompt to its optimal teacher via a query-level router that jointly considers the student models’ learnability and teacher models’ response quality. It successfully transfers the synthesis paradigm from the conventional "Generate then Select" to a more efficient manner, i.e., "Route then Generate", eliminating the need for all teacher models to generate parallel responses across the entire prompt set. Extensive experiments across different model families and scales demonstrate that PerSyn consistently outperforms all baselines on six benchmarks, including instruct tuning and math reasoning settings. Further analysis verifies the effectiveness of PerSyn and offers extra insights to propel future research. Our code is available at https://anonymous.4open.science/r/PerSyn-8D85.
2025
GuiLoMo: Allocating Experts and Ranks for LoRA-MoE via Bilevel Optimization with GuidedSelection Vectors
Xinrong Chen | Hengyuan Zhang | Yingmin Qiu | Xiao Liang | Ziyue Li | Guanyu Wang | Weiping Li | Tong Mo | Hayden Kwok-Hay So | Ngai Wong
Findings of the Association for Computational Linguistics: EMNLP 2025
Xinrong Chen | Hengyuan Zhang | Yingmin Qiu | Xiao Liang | Ziyue Li | Guanyu Wang | Weiping Li | Tong Mo | Hayden Kwok-Hay So | Ngai Wong
Findings of the Association for Computational Linguistics: EMNLP 2025
Parameter-efficient fine-tuning (PEFT) methods, particularly Low-Rank Adaptation (LoRA), offer an efficient way to adapt large language models with reduced computational costs. However, their performance is limited by the small number of trainable parameters. Recent work combines LoRA with the Mixture-of-Experts (MoE), i.e., LoRA-MoE, to enhance capacity, but two limitations remain in hindering the full exploitation of its potential: 1) the influence of downstream tasks when assigning expert numbers, and 2) the uniform rank assignment across all LoRA experts, which restricts representational diversity.To mitigate these gaps, we propose GuiLoMo, a fine-grained layer-wise expert numbers and ranks allocation strategy with GuidedSelection Vectors (GSVs). GSVs are learned via a prior bilevel optimization process to capture both model- and task-specific needs, and are then used to allocate optimal expert numbers and ranks.Experiments on three backbone models across diverse benchmarks show that GuiLoMo consistently achieves superior or comparable performance to all baselines. Further analysis offers key insights into how expert numbers and ranks vary across layers and tasks, highlighting the benefits of adaptive expert configuration. Our code is available at https://anonymous.4open.science/r/GuiLoMo-034.
TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review
Yuan Chang | Ziyue Li | Hengyuan Zhang | Yuanbo Kong | Yanru Wu | Hayden Kwok-Hay So | Zhijiang Guo | Liya Zhu | Ngai Wong
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Yuan Chang | Ziyue Li | Hengyuan Zhang | Yuanbo Kong | Yanru Wu | Hayden Kwok-Hay So | Zhijiang Guo | Liya Zhu | Ngai Wong
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
While Large Language Models (LLMs) have shown significant potential in assisting peer review, current methods often struggle to generate thorough and insightful reviews while maintaining efficiency. In this paper, we propose TreeReview, a novel framework that models paper review as a hierarchical and bidirectional question-answering process. TreeReview first constructs a tree of review questions by recursively decomposing high-level questions into fine-grained sub-questions and then resolves the question tree by iteratively aggregating answers from leaf to root to get the final review. Crucially, we incorporate a dynamic question expansion mechanism to enable deeper probing by generating follow-up questions when needed. We construct a benchmark derived from ICLR and NeurIPS venues to evaluate our method on full review generation and actionable feedback comments generation tasks. Experimental results of both LLM-based and human evaluation show that TreeReview outperforms strong baselines in providing comprehensive, in-depth, and expert-aligned review feedback, while reducing LLM token usage by up to 80% compared to computationally intensive approaches.