@inproceedings{yu-etal-2023-prompt,
    title = "Prompt-Based {M}onte-{C}arlo Tree Search for Goal-oriented Dialogue Policy Planning",
    author = "Yu, Xiao  and
      Chen, Maximillian  and
      Yu, Zhou",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.439/",
    doi = "10.18653/v1/2023.emnlp-main.439",
    pages = "7101--7125",
    abstract = "Planning for goal-oriented dialogue often requires simulating future dialogue interactions and estimating task progress. Many approaches thus consider training neural networks to perform look-ahead search algorithms such as A* search and Monte Carlo Tree Search (MCTS). However, this training often requires abundant annotated data, which creates challenges when faced with noisy annotations or low-resource settings. We introduce GDP-Zero, an approach using Open-Loop MCTS to perform goal-oriented dialogue policy planning without any model training. GDP-Zero prompts a large language model to act as a policy prior, value function, user simulator, and system model during the tree search. We evaluate GDP-Zero on the goal-oriented task PersuasionForGood, and find that its responses are preferred over ChatGPT up to 59.32{\%} of the time, and are rated more persuasive than ChatGPT during interactive evaluations."
}
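
To make the abstract concrete, below is a rough, illustrative sketch of the kind of open-loop MCTS loop it describes, where a single LLM is prompted to play the policy prior, value function, user simulator, and system model. All function names here (`prompt_llm`, `llm_policy_prior`, `llm_value`) and the dialogue-act list are hypothetical stand-ins with canned outputs so the sketch runs; the actual GDP-Zero prompts, act schema, and search details are specified in the paper.

```python
import math
import random

# Hypothetical dialogue-act set for illustration only; GDP-Zero uses the
# PersuasionForGood act schema described in the paper.
ACTIONS = ["greeting", "logical_appeal", "emotional_appeal", "propose_donation"]


def prompt_llm(role, history, action=None):
    """Placeholder for a prompted LLM call; returns canned text so the sketch runs."""
    if role == "system":
        return f"[system:{action}] ..."
    return "[user] ..."


def llm_policy_prior(history):
    """Policy prior over next dialogue acts; a uniform stand-in here."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}


def llm_value(history):
    """Value estimate of task progress; a random stand-in here."""
    return random.random()


class Node:
    def __init__(self, prior):
        self.prior, self.visits, self.value_sum = prior, 0, 0.0
        self.children = {}

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def select(node, c_puct=1.0):
    """PUCT-style selection over dialogue acts."""
    total = sum(ch.visits for ch in node.children.values())

    def score(a):
        ch = node.children[a]
        return ch.q() + c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)

    return max(node.children, key=score)


def simulate(root, history, depth=3):
    """One open-loop simulation: the tree stores only per-act statistics,
    and the dialogue along the chosen path is re-generated every traversal."""
    node, path, hist = root, [root], list(history)
    for _ in range(depth):
        if not node.children:  # expand with the LLM policy prior
            for a, p in llm_policy_prior(hist).items():
                node.children[a] = Node(p)
        act = select(node)
        hist.append(prompt_llm("system", hist, act))  # system response model
        hist.append(prompt_llm("user", hist))         # user simulator
        node = node.children[act]
        path.append(node)
    value = llm_value(hist)                           # LLM as value function
    for n in path:                                    # backpropagate
        n.visits += 1
        n.value_sum += value


def plan(history, n_sims=20):
    """Run simulations from the current dialogue history and return the
    most-visited dialogue act at the root."""
    root = Node(prior=1.0)
    for _ in range(n_sims):
        simulate(root, history)
    return max(root.children, key=lambda a: root.children[a].visits)


if __name__ == "__main__":
    print(plan(["[user] Hello!"]))
```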