Xinyu Luo
Also published as: 昕宇 罗
2025
Reward-Shifted Speculative Sampling Is An Efficient Test-Time Weak-to-Strong Aligner
Bolian Li | Yanran Wu | Xinyu Luo | Ruqi Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Aligning large language models (LLMs) with human preferences has become a critical step in their development. Recent research has increasingly focused on test-time alignment, where additional compute is allocated during inference to enhance LLM safety and reasoning capabilities. However, these test-time alignment techniques often incur substantial inference costs, limiting their practical application. To address this efficiency bottleneck, we draw on speculative sampling, an acceleration technique that uses a small draft model to predict future tokens efficiently. We introduce the reward-shifted speculative sampling (SSS) algorithm, in which the draft model is aligned with human preferences while the target model remains unchanged. We theoretically demonstrate that, by modifying the acceptance criterion and the bonus token distribution, the distributional shift between the aligned draft model and the unaligned target model can be exploited to recover the RLHF optimal solution without actually obtaining it. In test-time weak-to-strong alignment experiments, our algorithm achieves superior gold reward scores at a significantly reduced inference cost, validating both its effectiveness and efficiency.
2020
汉语学习者依存句法树库构建(Construction of a Treebank of Learner Chinese)
Jialu Shi (师佳璐) | Xinyu Luo (罗昕宇) | Liner Yang (杨麟儿) | Dan Xiao (肖丹) | Zhengsheng Hu (胡正声) | Yijun Wang (王一君) | Jiaxin Yuan (袁佳欣) | Yu Jingsi (余婧思) | Erhong Yang (杨尔弘)
Proceedings of the 19th Chinese National Conference on Computational Linguistics
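A dependency treebank of learner Chinese provides dependency parses for text written by non-native speakers. It can support second-language teaching and research, and it is also important for related work such as syntactic parsing of second-language text and grammatical error correction. However, few dependency treebanks of learner Chinese exist, and their annotation still has some problems. To address this, this paper improves the dependency annotation guidelines, builds an online annotation platform, and carries out dependency annotation of learner Chinese. The paper focuses on data selection and the annotation workflow, analyzes the quality of the annotation results, and explores how second-language errors affect annotation quality and syntactic parsing.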
Co-authors
- Zhengsheng Hu (胡正声, 胡正升) 1
- Yu Jingsi (余婧思) 1
- Bolian Li 1
- Jialu Shi (师佳璐) 1
- Yijun Wang (王一君) 1