MTO: A Multi-turn Conversational Text-to-OverpassQL Dataset for Enhanced OpenStreetMap Query Generation
Haodi Zhang, Xinrui Zhu, Mingze Kong, Zhidan Liu, Tao Fan, Kaishun Wu, Yuanfeng Song
Abstract
We propose a comprehensive framework for constructing multi-turn Text-to-OverpassQL dialogue datasets. Under this framework, we introduce the first multi-turn Text-to-OverpassQL dataset built upon the OverpassNL corpus. Our dataset comprises over 7,800 dialogues, each containing 2 to 4 user utterances, resulting in more than 20,000 individual utterances aligned with executable Overpass queries. To generate high-quality multi-turn dialogues, we design a four-stage pipeline. First, we convert Overpass queries into syntax trees using a custom parser developed based on the official OverpassQL grammar. This enables structural manipulation while preserving syntactic and executable validity. Second, we apply a diverse set of tree-editing templates, including both simple keyword-level changes and complex structural decompositions, to produce multiple valid and diverse Overpass queries. Third, we leverage a prompt-based approach to guide large language models in generating context-aware natural language questions, ensuring increasing inter-turn dependency across the dialogue. Finally, we implement a hybrid filtering strategy that combines manual annotation with model-assisted selection to validate alignment and correctness at scale. In addition to presenting the dataset, we evaluate the performance of several mainstream large language models and demonstrate that our end-to-end baseline model achieves competitive results. This work offers a new benchmark for studying executable semantic parsing and contextual understanding in map-based query tasks.
- Anthology ID:
- 2026.findings-acl.36
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 750–771
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.36/
- Cite (ACL):
- Haodi Zhang, Xinrui Zhu, Mingze Kong, Zhidan Liu, Tao Fan, Kaishun Wu, and Yuanfeng Song. 2026. MTO: A Multi-turn Conversational Text-to-OverpassQL Dataset for Enhanced OpenStreetMap Query Generation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 750–771, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- MTO: A Multi-turn Conversational Text-to-OverpassQL Dataset for Enhanced OpenStreetMap Query Generation (Zhang et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.36.pdf
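
The first two pipeline stages in the abstract (parsing a query into a tree, then applying tree-editing templates to derive follow-up queries) can be illustrated with a toy sketch. This is not the authors' parser or template set: the `Statement` and `TagFilter` classes, the `swap_tag_value` and `add_filter` helpers, and the example tags are all hypothetical, and the tree covers only a single area-scoped statement rather than the full OverpassQL grammar.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class TagFilter:
    """One key/value tag condition (hypothetical toy node type)."""
    key: str
    value: str


@dataclass(frozen=True)
class Statement:
    """A tiny stand-in for a query syntax tree: one element type,
    its tag filters, and the name of the area to search in."""
    element: str                      # "node", "way", or "relation"
    filters: Tuple[TagFilter, ...]
    area: str


def render(stmt: Statement) -> str:
    """Serialize the toy tree back into Overpass-style query text."""
    tags = "".join(f'["{f.key}"="{f.value}"]' for f in stmt.filters)
    return (
        f'area["name"="{stmt.area}"]->.a;'
        f"{stmt.element}{tags}(area.a);"
        "out;"
    )


def swap_tag_value(stmt: Statement, key: str, new_value: str) -> Statement:
    """Keyword-level edit (assumed template): change the value of one tag."""
    new_filters = tuple(
        replace(f, value=new_value) if f.key == key else f
        for f in stmt.filters
    )
    return replace(stmt, filters=new_filters)


def add_filter(stmt: Statement, key: str, value: str) -> Statement:
    """Structural edit (assumed template): append an extra tag condition."""
    return replace(stmt, filters=stmt.filters + (TagFilter(key, value),))


if __name__ == "__main__":
    # Each edit yields the query for the next dialogue turn.
    turn1 = Statement("node", (TagFilter("amenity", "cafe"),), "Berlin")
    turn2 = swap_tag_value(turn1, "amenity", "restaurant")
    turn3 = add_filter(turn2, "cuisine", "italian")
    for turn in (turn1, turn2, turn3):
        print(render(turn))
```

Because every edit operates on the tree and the text is produced only by `render`, each derived query stays syntactically well-formed by construction, which is the property the abstract attributes to tree-level manipulation. Checking that a derived query is also executable would still require running it against an Overpass endpoint.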