Xiaotian Zhang


2025

DiffPO: Diffusion-styled Preference Optimization for Inference Time Alignment of Large Language Models
Ruizhe Chen | Wenhao Chai | Zhifei Yang | Xiaotian Zhang | Ziyang Wang | Tony Quek | Joey Tianyi Zhou | Soujanya Poria | Zuozhu Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Inference-time alignment provides an efficient alternative for aligning LLMs with humans. However, these approaches still face challenges, such as limited scalability due to policy-specific value functions and latency during the inference phase. In this paper, we propose a novel approach, Diffusion-styled Preference Optimization (DiffPO), which provides an efficient and policy-agnostic solution for aligning LLMs with humans. By directly performing alignment at sentence level, DiffPO avoids the time latency associated with token-level generation. Designed as a plug-and-play module, DiffPO can be seamlessly integrated with various base models to enhance their alignment. Extensive experiments on AlpacaEval 2, MT-bench, and HH-RLHF demonstrate that DiffPO achieves superior alignment performance across various settings, achieving a favorable trade-off between alignment quality and inference-time latency. Furthermore, DiffPO demonstrates model-agnostic scalability, significantly improving the performance of large models such as Llama-3-70B.

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Xiaotian Zhang | Ruizhe Chen | Yang Feng | Zuozhu Liu
Findings of the Association for Computational Linguistics: ACL 2025

Aligning language models with human preferences presents significant challenges, particularly in achieving personalization without incurring excessive computational costs. Existing methods rely on reward signals and additional annotated data, limiting their scalability and adaptability to diverse human values. To address these challenges, we introduce Persona-judge, a novel discriminative paradigm that enables training-free personalized alignment with unseen preferences. Instead of optimizing policy parameters through external reward feedback, Persona-judge leverages the intrinsic preference judgment capabilities of the model. Specifically, a draft model generates candidate tokens conditioned on a given preference, while a judge model, embodying another preference, cross-validates the predicted tokens to decide whether they should be accepted. Experimental results demonstrate that Persona-judge, using the inherent preference evaluation mechanisms of the model, offers a scalable and computationally efficient solution to personalized alignment, paving the way for more adaptive customized alignment. Our code is available here.
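
The draft-and-judge loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The acceptance rule (a fixed probability `threshold`), the fallback to the judge's most likely token, and the names `draft_and_judge`, `toy_draft`, and `toy_judge` are all assumptions made for illustration. The abstract does not specify any of them.

```python
from typing import Callable, Dict, List

# A "model" here is any function mapping a token context to a
# next-token probability distribution (hypothetical interface).
Dist = Dict[str, float]
Model = Callable[[List[str]], Dist]


def draft_and_judge(draft: Model, judge: Model, prefix: List[str],
                    max_tokens: int, threshold: float = 0.1) -> List[str]:
    """Generate tokens with a draft model and let a judge model veto them.

    Assumption: a drafted token is accepted when the judge assigns it
    probability >= threshold; otherwise the judge's top token is used.
    """
    out = list(prefix)
    for _ in range(max_tokens):
        draft_dist = draft(out)
        judge_dist = judge(out)
        # The draft model proposes its most likely token.
        candidate = max(draft_dist, key=draft_dist.get)
        if judge_dist.get(candidate, 0.0) >= threshold:
            token = candidate  # judge accepts the proposal
        else:
            token = max(judge_dist, key=judge_dist.get)  # judge overrides
        if token == "<eos>":
            break
        out.append(token)
    return out


# Toy stand-ins for two preference-conditioned models (hypothetical).
def toy_draft(ctx: List[str]) -> Dist:
    if len(ctx) < 2:
        return {"formal": 0.6, "casual": 0.3, "<eos>": 0.1}
    return {"<eos>": 1.0}


def toy_judge(ctx: List[str]) -> Dist:
    if len(ctx) < 2:
        return {"formal": 0.05, "casual": 0.9, "<eos>": 0.05}
    return {"<eos>": 1.0}


strict = draft_and_judge(toy_draft, toy_judge, ["hi"], max_tokens=5)
lenient = draft_and_judge(toy_draft, toy_judge, ["hi"], max_tokens=5,
                          threshold=0.01)
```

With the default threshold the judge rejects the drafted token and substitutes its own, while a lower threshold lets the draft's proposal through. This shows how the two preferences interact without any parameter updates.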

2023

Investigating Glyph-Phonetic Information for Chinese Spell Checking: What Works and What’s Next?
Xiaotian Zhang | Yanjun Zheng | Hang Yan | Xipeng Qiu
Findings of the Association for Computational Linguistics: ACL 2023

While pre-trained Chinese language models have demonstrated impressive performance on a wide range of NLP tasks, the Chinese Spell Checking (CSC) task remains a challenge. Previous research has explored using information such as glyphs and phonetics to improve the ability of CSC models to distinguish misspelled characters, with good results at the accuracy level on public datasets. However, the generalization ability of these CSC models has not been well understood: it is unclear whether they incorporate glyph-phonetic information and, if so, whether this information is fully utilized. In this paper, we aim to better understand the role of glyph-phonetic information in the CSC task and suggest directions for improvement. Additionally, we propose a new, more challenging, and practical setting for testing the generalizability of CSC models. All code is made publicly available.

Multijugate Dual Learning for Low-Resource Task-Oriented Dialogue System
Shimin Li | Xiaotian Zhang | Yanjun Zheng | Linyang Li | Xipeng Qiu
Findings of the Association for Computational Linguistics: ACL 2023

Dialogue data in real scenarios tend to be sparsely available, leaving data-starved end-to-end dialogue systems inadequately trained. We discover that data utilization efficiency in low-resource scenarios can be enhanced by mining alignment information between uncertain utterances and deterministic dialogue states. Therefore, we implement dual learning in task-oriented dialogues to exploit the correlation of heterogeneous data. In addition, the one-to-one duality is converted into a multijugate duality to reduce the influence of spurious correlations in dual training and improve generalization. Because it introduces no additional parameters, our method can be implemented in arbitrary networks. Extensive empirical analyses demonstrate that our proposed method improves the effectiveness of end-to-end task-oriented dialogue systems on multiple benchmarks and obtains state-of-the-art results in low-resource scenarios.

2012

A Machine Learning Approach to Convert CCGbank to Penn Treebank
Xiaotian Zhang | Hai Zhao | Cong Hui
Proceedings of COLING 2012: Demonstration Papers

Chinese Coreference Resolution via Ordered Filtering
Xiaotian Zhang | Chunyang Wu | Hai Zhao
Joint Conference on EMNLP and CoNLL - Shared Task

2010

Dependency Parser for Chinese Constituent Parsing
Xuezhe Ma | Xiaotian Zhang | Hai Zhao | Bao-Liang Lu
CIPS-SIGHAN Joint Conference on Chinese Language Processing