@inproceedings{dao-liao-2025-one,
    title = "One Planner To Guide Them All ! Learning Adaptive Conversational Planners for Goal-oriented Dialogues",
    author = "Dao, Huy Quang  and
      Liao, Lizi",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1123/",
    pages = "22103--22127",
    ISBN = "979-8-89176-332-6",
    abstract = "Goal-oriented dialogues, such as recommendation and negotiation, often require balancing multiple, conflicting objectives. Existing methods typically involve training separate models for specific combinations of objectives, leading to computational and scalability issues. In this work, we aim to develop a new dialogue policy method that can adapt to varying objective preferences at inference time without retraining. This raises several challenges in terms of both (1) optimization strategy and (2) knowledge utilization. To address these, we propose a novel learning framework, Preference Adaptive Dialogue Policy Planner (PADPP), for multi-objective goal-oriented dialogues. Specifically, to tackle the former, we introduce a novel policy optimization scheme, which leverages information gained from training the model on previously updated objective weights, accelerating the learning capability on new weight settings. To address the latter, we utilize Generalized Policy Improvement (GPI) to ensure the effectiveness of leveraged knowledge. Experimental results demonstrate that PADPP achieves superior adaptability and performance compared to state-of-the-art approaches, offering a scalable and flexible solution for multi-objective, goal-oriented dialogues. Code and data are available at the anonymous link."
}