Kaiyuan Zhang
2026
WSDPO: A Generative Word Sense Disambiguation Framework with Chain-of-Thought and Preference Optimization
Kunpeng Kang | Shuaimin Li | Kaiyuan Zhang | Luyang Zhang | Jiasheng Si | Bing Xu | Kehai Chen | Muyun Yang | Wenpeng Lu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Word sense disambiguation (WSD) is a foundational task in natural language processing. Recent research has reformulated WSD for large language models (LLMs) as a generative task, where the model produces a definition to convey the intended meaning of an ambiguous word in context. In practice, most existing approaches implement this formulation through straightforward supervised fine-tuning, which tends to prioritize superficial context-to-gloss memorization over true contextual sense discrimination, leading to degraded performance on less frequent senses (LFS), particularly in unseen settings. To address this issue, we propose WSDPO, a training framework for generative WSD with chain-of-thought (CoT) and preference optimization. WSDPO consists of three stages: (1) disambiguation-aware CoT construction, which produces training data containing explicit disambiguation steps for the later stages; (2) disambiguation-guided supervised fine-tuning, which explicitly trains the model to discriminate between word senses before generating the final definition; and (3) preference-based optimization, which further strengthens the model’s ability to generate sense-faithful definitions by optimizing it on preference pairs constructed from multiple sampled CoT outputs. Extensive experiments across benchmark datasets and multiple backbone LLMs demonstrate that WSDPO achieves substantial performance gains in rare and unseen settings and exhibits strong generalization in standard evaluation settings.
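As an illustration of stage (3), the following is a minimal sketch of how preference pairs might be built from several sampled CoT outputs. The abstract does not say how the pairs are selected, so the rule used here (an output that commits to the gold sense is "chosen", one that commits to another sense is "rejected") is an assumption. The `SampledOutput` type, the `build_preference_pairs` helper, the `max_pairs` cap, and the sense identifiers are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SampledOutput:
    """One sampled model output (hypothetical structure, not from the paper)."""
    reasoning: str   # chain-of-thought text
    sense_id: str    # sense the output finally commits to
    definition: str  # generated definition


def build_preference_pairs(samples, gold_sense_id, max_pairs=4):
    """Return (chosen, rejected) pairs from sampled outputs.

    Assumed rule: an output is sense-faithful if its sense_id equals the
    gold sense. Every faithful output is paired with every unfaithful one,
    and the result is truncated to at most max_pairs pairs.
    """
    chosen = [s for s in samples if s.sense_id == gold_sense_id]
    rejected = [s for s in samples if s.sense_id != gold_sense_id]
    return list(product(chosen, rejected))[:max_pairs]


# Toy samples for the word "bank" in "We fished from the bank."
samples = [
    SampledOutput("The context mentions fishing.", "bank_river",
                  "sloping land beside a body of water"),
    SampledOutput("'bank' usually refers to finance.", "bank_finance",
                  "an institution that accepts deposits"),
    SampledOutput("Fishing happens next to water.", "bank_river",
                  "the land alongside a river"),
]
pairs = build_preference_pairs(samples, gold_sense_id="bank_river")
```

If every sample commits to the same sense, the function returns an empty list, so that instance contributes no preference signal under this assumed rule.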