Shixuan Liu
Also published as: 世萱 刘
2026
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models
Binghai Wang | Yantao Liu | Yuxuan Liu | Tianyi Tang | Shenzhi Wang | Chang Gao | Chujie Zheng | Yichang Zhang | Le Yu | Shixuan Liu | Tao Gui | Qi Zhang | Xuanjing Huang | Bowen Yu | Fei Huang | Junyang Lin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Generative Reward Models (GenRMs) and LLM-as-a-Judge exhibit deceptive alignment by producing correct judgments for incorrect reasons, as they are trained and evaluated to prioritize Outcome Accuracy, which undermines their ability to generalize during RLHF. We introduce Rationale Consistency, a fine-grained metric that quantifies the alignment between the model’s reasoning process and human judgment. Our evaluation of frontier models reveals that rationale consistency effectively discriminates among state-of-the-art models and detects deceptive alignment, while outcome accuracy falls short in both respects. To mitigate this gap, we introduce a hybrid signal that combines rationale consistency with outcome accuracy for GenRM training. Our training method achieves state-of-the-art performance on RM-Bench (87.1%) and JudgeBench (82%), surpassing outcome-only baselines by an average of 5%. When the resulting reward model is used during RLHF, our method improves performance on Arena Hard v2, notably yielding a 7% improvement on creative writing tasks. Further analysis confirms that our method escapes the deceptive alignment trap, effectively reversing the decline in rationale consistency observed in outcome-only training.
2024
TeleChat: An Open-source Billingual Large Language Model
Zihan Wang | XinZhang Liu | Shixuan Liu | Yitong Yao | Yunyao Huang | Mengxiang Li | Zhongjiang He | Yongxian Li | Luwen Pu | Huinan Xu | Chao Wang | Shuangyong Song
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)
In this paper, we present TeleChat, a collection of large language models (LLMs) with 7 billion and 12 billion parameters. TeleChat is initially pretrained on an extensive corpus containing a diverse collection of English and Chinese texts, encompassing trillions of tokens. Subsequently, the model undergoes fine-tuning to align with human preferences, following a detailed methodology that we describe. We evaluate the performance of TeleChat on various tasks, including general dialogue generation, language understanding, mathematics, reasoning, code generation, and knowledge-based question answering. Our findings indicate that TeleChat achieves state-of-the-art performance relative to other open-source models of similar size across a wide range of public benchmarks. To support future research and applications utilizing LLMs, we release the fine-tuned model checkpoints of TeleChat-7B and TeleChat-12B, along with code and a portion of our filtered high-quality pretraining data, to the public community.
2023
CCL23-Eval 任务7赛道一系统报告:基于序列到序列模型的自动化文本纠错系统(System Report for CCL23-Eval Task 7 Track 1: Automated text error correction pipeline based on sequence-to-sequence models)
Shixuan Liu (刘世萱) | Xinzhang Liu (刘欣璋) | Yuyao Huang (黄钰瑶) | Chao Wang (王超) | Yongshuang Song (宋双永)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
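This paper presents the system our team submitted to Track 1 of the CCL-2023 Chinese Learner Text Correction evaluation. In recent years, large-scale Chinese pretrained models have performed well on a variety of tasks, and different pretrained models each have their own strengths on specific tasks. However, because the Chinese learner text correction task involves complex grammatical errors and scarce correction corpora, adopting a pretrained text correction model based on sequence labeling is a natural choice. Our team adopted a sequence-to-sequence correction model with a two-stage training strategy and designed a pipeline for sequence-to-sequence text correction. First, we cleaned the training data; in the first training stage, we applied data augmentation to the training set; in the second stage, we fine-tuned on the validation set, and finally performed post-processing by ensembling multiple models through voting. In the official system evaluation, our submission exceeded the baseline model by 17.01 points on the closed-task leaderboard (40.59 -> 57.6).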
Co-authors
- Xinzhang Liu 2
- Chao Wang 2
- Chang Gao 1
- Tao Gui 1
- Zhongjiang He 1
- Xuan-Jing Huang (黄萱菁) 1
- Fei Huang 1
- Yunyao Huang 1
- Yuyao Huang 1
- Mengxiang Li (李孟祥) 1
- Yongxian Li 1
- Junyang Lin 1
- Yantao Liu 1
- Yuxuan Liu 1
- Luwen Pu 1
- Shuangyong Song (宋双永) 1
- Yongshuang Song 1
- Tianyi Tang 1
- Binghai Wang 1
- Shenzhi Wang 1
- Zihan Wang 1
- Huinan Xu 1
- Yitong Yao 1
- Le Yu 1
- Bowen Yu 1
- Yichang Zhang 1
- Qi Zhang 1
- Chujie Zheng 1