Kaibin Tian


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Improving Preference Alignment of LLM with Inference-Free Self-Refinement
Fukun Ma | Kaibin Tian | Jieting Xue | Xiaoyi Wang | Ye Ma | Quan Chen | Peng Jiang | Lijie Wen
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) develop the in-context learning capability through pretraining and instruction tuning, enabling task adaptation without parameter updates. Self-refinement is a manifestation of this capability, which allows LLMs to iteratively refine the output using self-generated feedback. However, empirical observations reveal Inference-Free Self-Refinement (IFSR) in preference alignment: LLMs generate preference-improved output via fixed instructions, requiring no specific feedback, even no initial responses. There are two key components of the IFSR in preference alignment. The refining instruction is a fixed instruction that constrains the output distribution from a preference-semantic perspective. During training, it facilitates joint learning of preference-related semantic representations and data distribution alignment. The pseudo reference response is constructed from paired preference data and serves as a demonstration to guide the output distribution. It mitigates off-policy distributional bias while enhancing token-level preference learning in training. Experiments across multiple datasets demonstrate that incorporating IFSR into preference alignment yields performance improvement over 10%. Further ablation studies reveal additional characteristics and potential principles of IFSR.