Abstract
Instruction-finetuned large language models (LLMs) have recently gained huge popularity thanks to their ability to interact with users through conversation. In this work, we aim to evaluate their ability to complete multi-turn tasks and interact with external databases in the context of established task-oriented dialogue benchmarks. We show that in explicit belief state tracking, LLMs underperform compared to specialized task-specific models. Nevertheless, they show some ability to guide the dialogue to a successful ending through their generated responses if they are provided with correct slot values. Furthermore, this ability improves with few-shot in-domain examples.
- Anthology ID:
- 2023.sigdial-1.21
- Volume:
- Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month:
- September
- Year:
- 2023
- Address:
- Prague, Czechia
- Editors:
- Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 216–228
- URL:
- https://aclanthology.org/2023.sigdial-1.21
- DOI:
- 10.18653/v1/2023.sigdial-1.21
- Cite (ACL):
- Vojtěch Hudeček and Ondrej Dusek. 2023. Are Large Language Models All You Need for Task-Oriented Dialogue?. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 216–228, Prague, Czechia. Association for Computational Linguistics.
- Cite (Informal):
- Are Large Language Models All You Need for Task-Oriented Dialogue? (Hudeček & Dusek, SIGDIAL 2023)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2023.sigdial-1.21.pdf