SQLWOZ: A Realistic Task-Oriented Dialogue Dataset with SQL-Based Dialogue State Representation for Complex User Requirements

Heng-Da Xu, Xian-Ling Mao, Fanshu Sun, Tian-Yi Che, Cheng-Xin Xin, Heyan Huang


Abstract
High-quality datasets are essential for building effective task-oriented dialogue (TOD) systems. The existing TOD datasets often present overly simplified interactions, where users incrementally express straightforward requests that can be managed with basic slot-value style dialogue states, such as “hotel-area = east.” However, this approach does not reflect real-life scenarios in which users may express complex constraints and preferences. To address this gap, in this paper, we propose SQLWOZ, a novel TOD dataset designed to capture complex, real-world user requirements. The user requirements in SQLWOZ include the four categories: 1) multiple values for a slot, 2) excluded values within a slot, 3) preferred or prioritized values, and 4) conditional values based on other conditions. We utilize SQL statements as a formalized and expressive representation of dialogue states within SQLWOZ. To evaluate the dataset, we adapt large language models as dialogue agents and conduct extensive experiments on the SQL-based dialogue state tracking, dialogue response generation and end-to-end TOD tasks. The experimental results demonstrate the complexity and quality of SQLWOZ, establishing it as a new benchmark for advancing TOD research.
Anthology ID:
2025.emnlp-main.383
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
7537–7562
Language:
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.383/
DOI:
Bibkey:
Cite (ACL):
Heng-Da Xu, Xian-Ling Mao, Fanshu Sun, Tian-Yi Che, Cheng-Xin Xin, and Heyan Huang. 2025. SQLWOZ: A Realistic Task-Oriented Dialogue Dataset with SQL-Based Dialogue State Representation for Complex User Requirements. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 7537–7562, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
SQLWOZ: A Realistic Task-Oriented Dialogue Dataset with SQL-Based Dialogue State Representation for Complex User Requirements (Xu et al., EMNLP 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.383.pdf
Checklist:
 2025.emnlp-main.383.checklist.pdf