Didi Zhang
2025
Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction
Didi Zhang | Yaxin Fan | Peifeng Li | Qiaoming Zhu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Goal-oriented proactive dialogue systems are designed to guide user conversations seamlessly towards specific objectives by planning a goal-oriented path. However, previous research has focused predominantly on optimizing these paths while neglecting the inconsistencies that may arise between generated responses and dialogue contexts, including user profiles, dialogue history, domain knowledge, and subgoals. To address this issue, we introduce a model-agnostic two-stage Consistency Reflection and Correction (CRC) framework. Specifically, in the consistency reflection stage, the model is prompted to reflect on the discrepancies between generated responses and dialogue contexts, identifying inconsistencies and suggesting possible corrections. In the consistency correction stage, the model generates responses that are more consistent with the dialogue context based on these reflection results. We conducted experiments on model architectures of various parameter sizes, including encoder-decoder models (BART, T5) and decoder-only models (GPT-2, DialoGPT, Phi3, Mistral, and LLaMA3), and the results on three datasets demonstrate that our CRC framework significantly improves the consistency between generated responses and dialogue contexts.
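The abstract only outlines the two stages, so the following is a minimal, hypothetical sketch of how a reflect-then-correct loop could be wired around an arbitrary generation backend. The prompt wording, the generate callable, and the dict of context fields are assumptions of this sketch, not the authors' implementation.

```python
# Hypothetical sketch of a two-stage reflect-then-correct pipeline
# (model-agnostic: `generate` can be any text-generation callable).
def crc_respond(generate, context, draft_response):
    """Reflect on inconsistencies in a draft response, then regenerate it."""
    # Stage 1: consistency reflection -- ask the model to list discrepancies
    # between the draft response and each context element.
    reflection_prompt = (
        "User profile: {profile}\n"
        "Dialogue history: {history}\n"
        "Domain knowledge: {knowledge}\n"
        "Current subgoal: {subgoal}\n"
        "Draft response: {draft}\n"
        "List any inconsistencies between the draft response and the context "
        "above, and suggest how to correct them."
    ).format(draft=draft_response, **context)
    reflection = generate(reflection_prompt)

    # Stage 2: consistency correction -- regenerate the response conditioned
    # on the reflection results.
    correction_prompt = (
        "User profile: {profile}\n"
        "Dialogue history: {history}\n"
        "Domain knowledge: {knowledge}\n"
        "Current subgoal: {subgoal}\n"
        "Reflection on the draft response: {reflection}\n"
        "Rewrite the response so that it is consistent with the context above."
    ).format(reflection=reflection, **context)
    return generate(correction_prompt)

# Toy usage with a placeholder model; swap in any real generation backend.
if __name__ == "__main__":
    echo_model = lambda prompt: "[model output conditioned on]\n" + prompt[:80]
    ctx = {"profile": "likes jazz", "history": "User asked for weekend plans.",
           "knowledge": "Jazz festival downtown on Saturday.",
           "subgoal": "recommend a concert"}
    print(crc_respond(echo_model, ctx, "How about a rock festival?"))
```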
Enhancing Goal-oriented Proactive Dialogue Systems via Dynamic Multi-dimensional Consistency Optimization
Didi Zhang | Yaxin Fan | Peifeng Li | Qiaoming Zhu
Findings of the Association for Computational Linguistics: EMNLP 2025
Previous work on goal-oriented proactive dialogue systems frequently failed to address the multi-dimensional consistency issue between generated responses and key contextual elements (e.g., user profile, dialogue history, domain knowledge, and subgoal). To address this issue, we propose a novel Dynamic Multi-dimensional Consistency Reinforcement Learning (DMCRL) framework, which adaptively measures the impact of each consistency dimension on overall dialogue quality and provides targeted feedback to improve response quality. Experimental results on two datasets demonstrate that our DMCRL significantly improves the consistency of generated responses.
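DMCRL's adaptive weighting is described only at a high level, so the sketch below shows one plausible way to combine per-dimension consistency scores into a single reward signal that emphasizes whichever dimensions currently score worst. The softmax-over-deficit weighting, the dimension names, and the scorer values are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative sketch: per-dimension consistency scores combined into one
# reward, with weights that adapt toward the weakest dimensions.
import math

DIMENSIONS = ("profile", "history", "knowledge", "subgoal")

def dynamic_consistency_reward(scores, temperature=1.0):
    """Combine per-dimension consistency scores (each in [0, 1]) into a reward.

    Dimensions with lower scores (larger deficits) receive larger weights,
    so the feedback targets the weakest consistency dimension.
    """
    deficits = {d: 1.0 - scores[d] for d in DIMENSIONS}
    exp_d = {d: math.exp(deficits[d] / temperature) for d in DIMENSIONS}
    total = sum(exp_d.values())
    weights = {d: exp_d[d] / total for d in DIMENSIONS}
    reward = sum(weights[d] * scores[d] for d in DIMENSIONS)
    return reward, weights

# Toy usage: the subgoal dimension scores lowest, so it gets the largest weight.
reward, w = dynamic_consistency_reward(
    {"profile": 0.9, "history": 0.8, "knowledge": 0.85, "subgoal": 0.4}
)
print(round(reward, 3), {d: round(x, 2) for d, x in w.items()})
```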