Skeletons Matter: Dynamic Data Augmentation for Text-to-Query

Yuchen Ji, Bo Xu, Jie Shi, Jiaqing Liang, Deqing Yang, Yu Mao, Hai Chen, Yanghua Xiao


Abstract
The task of translating natural language questions into query languages has long been a central focus in semantic parsing. Recent advancements in Large Language Models (LLMs) have significantly accelerated progress in this field. However, existing studies typically focus on a single query language, resulting in methods with limited generalizability across different languages. In this paper, we formally define the Text-to-Query task paradigm, unifying semantic parsing tasks across various query languages. We identify query skeletons as a shared optimization target of Text-to-Query tasks, and propose a general dynamic data augmentation framework that explicitly diagnoses model-specific weaknesses in handling these skeletons to synthesize targeted training data. Experiments on four Text-to-Query benchmarks demonstrate that our method achieves state-of-the-art performance using only a small amount of synthesized data, highlighting the efficiency and generality of our approach and laying a solid foundation for unified research on Text-to-Query tasks. We release our code at https://github.com/jjjycaptain/Skeletron
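As a rough illustration of the idea summarized in the abstract, the sketch below masks identifiers and literals in a query to obtain a skeleton and then ranks skeletons by a model's error rate on them. It is not the authors' implementation (the released repository is the reference for that). It assumes SQL as the query language and a simple token-masking definition of "skeleton"; the names `SQL_KEYWORDS`, `extract_skeleton`, and `weakest_skeletons` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately small keyword list; a real system would use a SQL parser.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "as", "and", "or", "not", "in", "like", "between",
    "distinct", "count", "sum", "avg", "min", "max", "asc", "desc",
    "union", "intersect", "except",
}

# Tokens: quoted strings, numbers, identifiers (optionally dotted), single symbols.
TOKEN_RE = re.compile(
    r"'[^']*'|\"[^\"]*\"|\d+(?:\.\d+)?|[A-Za-z_][\w.]*|[^\sA-Za-z_\d]"
)


def extract_skeleton(query):
    """Keep keywords and symbols; replace identifiers and literals with '_'."""
    out = []
    for tok in TOKEN_RE.findall(query):
        if tok.lower() in SQL_KEYWORDS:
            out.append(tok.upper())
        elif tok[0].isalnum() or tok[0] in "_'\"":
            out.append("_")  # schema item, alias, or literal value
        else:
            out.append(tok)  # operator or punctuation
    return " ".join(out)


def weakest_skeletons(records, top_k=3):
    """Rank skeletons by error rate over (gold_query, is_correct) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for gold_query, is_correct in records:
        skeleton = extract_skeleton(gold_query)
        totals[skeleton] += 1
        if not is_correct:
            errors[skeleton] += 1
    # Highest error rate first; ties broken by frequency, then alphabetically.
    ranked = sorted(
        totals, key=lambda s: (-(errors[s] / totals[s]), -totals[s], s)
    )
    return [(s, errors[s] / totals[s]) for s in ranked[:top_k]]
```

In this sketch, the skeletons returned by `weakest_skeletons` would be the ones for which additional training examples are synthesized; how that synthesis is done is outside the scope of the example.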
Anthology ID:
2025.emnlp-main.64
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1214–1236
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.64/
Cite (ACL):
Yuchen Ji, Bo Xu, Jie Shi, Jiaqing Liang, Deqing Yang, Yu Mao, Hai Chen, and Yanghua Xiao. 2025. Skeletons Matter: Dynamic Data Augmentation for Text-to-Query. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 1214–1236, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Skeletons Matter: Dynamic Data Augmentation for Text-to-Query (Ji et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.64.pdf
Checklist:
 2025.emnlp-main.64.checklist.pdf