PuzzleClone: A DSL-Powered Framework for Synthesizing Verifiable Data

Kai Xiong; Yanwei Huang; Rongjunchen Zhang; Kun Chen; Haipang Wu; Yingcai Wu

Abstract

High-quality mathematical and logical datasets with verifiable answers are essential for strengthening the reasoning capabilities of large language models (LLMs). While recent data augmentation techniques have facilitated the creation of large-scale benchmarks, existing LLM-generated datasets often suffer from limited reliability, diversity, and scalability. To address these challenges, we introduce PuzzleClone, a formal framework for synthesizing verifiable data at scale using a novel DSL-driven approach. Our approach features three key innovations: (1) encoding seed puzzles into structured logical specifications, (2) generating scalable variants through systematic variable and constraint randomization, and (3) ensuring validity via a reproduction mechanism. Applying PuzzleClone, we construct PC-83K, a benchmark comprising over 83K diverse and programmatically validated puzzles. The generated puzzles span a wide spectrum of difficulty and formats, posing significant challenges to current state-of-the-art models. Experimental results show that post-training (SFT and RL) on PC-83K yields substantial improvements not only on the test set but also on various logic and mathematical benchmarks. Post-training raises average performance on PC-83K from 14.5 to 66.0 and delivers consistent improvements across 7 logic and mathematical benchmarks, of up to 18.4 absolute percentage points (SATBench from 51.6 to 70.0). Our code and data are available at https://github.com/HiThink-Research/PuzzleClone.

Anthology ID: 2026.findings-acl.1669
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 33378–33404
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1669/
Cite (ACL): Kai Xiong, Yanwei Huang, Rongjunchen Zhang, Kun Chen, Haipang Wu, and Yingcai Wu. 2026. PuzzleClone: A DSL-Powered Framework for Synthesizing Verifiable Data. In Findings of the Association for Computational Linguistics: ACL 2026, pages 33378–33404, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): PuzzleClone: A DSL-Powered Framework for Synthesizing Verifiable Data (Xiong et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1669.pdf
Checklist: 2026.findings-acl.1669.checklist.pdf
