RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library
Jiapeng Wang, Jinhao Jiang, Zhiqiang Zhang, Jun Zhou, Xin Zhao
Abstract
The advancement of reasoning capabilities in Large Language Models (LLMs) requires substantial amounts of high-quality reasoning data, particularly in mathematics. Existing data synthesis methods, such as data augmentation from annotated training sets or direct question generation based on relevant knowledge points and documents, have expanded datasets, but they struggle to capture the internal logic of a problem during generation and to ensure that solutions are verifiable. To address these issues, we propose RV-Syn, a novel Rational and Verifiable mathematical Synthesis approach. RV-Syn first constructs a structured library of mathematical operations and then composes them into executable computational graphs, which serve as verifiable solution blueprints. These graphs are subsequently back-translated into complex problems, enabling solution-guided, logic-aware problem generation while inherently ensuring the verifiability of the solving process. Experimental results show that RV-Syn surpasses existing synthesis methods, including those involving human-crafted problems. Our method achieves a 6.3% performance gain over the previous state-of-the-art synthetic data on LLaMA-3-8B and demonstrates superior data efficiency, outperforming others with only half the training data (50k vs. 100k). This enables a more scalable and robust framework for generating reasoning datasets.
- Anthology ID:
- 2026.findings-eacl.93
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2026
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Vera Demberg, Kentaro Inui, Lluís Marquez
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1812–1827
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.93/
- Cite (ACL):
- Jiapeng Wang, Jinhao Jiang, Zhiqiang Zhang, Jun Zhou, and Xin Zhao. 2026. RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library. In Findings of the Association for Computational Linguistics: EACL 2026, pages 1812–1827, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library (Wang et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.93.pdf
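
Illustrative sketch (not from the paper): the abstract describes composing operations from a structured function library into an executable computational graph that acts as a verifiable solution blueprint. The Python below is a minimal sketch of that general idea under stated assumptions. The names FUNCTION_LIBRARY, Node, and execute_graph, the three sample operations, and the graph itself are all hypothetical and chosen only for illustration; the authors' actual library, graph format, and back-translation step are not shown on this page and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# Hypothetical function library: these operation names are illustrative,
# not the operations used by RV-Syn.
FUNCTION_LIBRARY: Dict[str, Callable[..., float]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
    "rectangle_area": lambda w, h: w * h,
}


@dataclass
class Node:
    name: str                        # variable that stores this node's result
    op: str                          # key into FUNCTION_LIBRARY
    inputs: List[Union[str, float]]  # earlier node names or literal constants


def execute_graph(nodes: List[Node]) -> Dict[str, float]:
    """Evaluate nodes in listed order; string inputs refer to earlier nodes."""
    values: Dict[str, float] = {}
    for node in nodes:
        args = [values[i] if isinstance(i, str) else i for i in node.inputs]
        values[node.name] = FUNCTION_LIBRARY[node.op](*args)
    return values


# A three-step graph: area of a 3x4 rectangle, doubled, plus 5.
graph = [
    Node("area", "rectangle_area", [3, 4]),
    Node("doubled", "multiply", ["area", 2]),
    Node("total", "add", ["doubled", 5]),
]

result = execute_graph(graph)
print(result["total"])  # 29
```

Because the graph is executed rather than merely described, its final value can be recomputed and compared against any answer attached to a problem written from it, which is the sense of "verifiable" assumed in this sketch.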