Enhanced Data Synthesis for LLM through Reasoning Structures Generated by Hierarchical GFlowNet
Tianpeng Bu | Minying Zhang | Hongtao Duan | Shurui Li | Lulu Hu | Yu Li
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) excel at problem-solving but require training data with diverse reasoning processes. Existing methods mainly optimize instruction-response pairs and lack a systematic design for the underlying reasoning structure. This paper proposes RSS: a Reasoning Structure driven data Synthesis method. We first develop a hierarchical GFlowNet that efficiently constructs reasoning structures through a coarse-to-fine directed acyclic graph (DAG) growth process. The reasoning DAGs are then leveraged to guide instruction generation via an iterative suggester-editor workflow and to enhance response quality with a structure-aware strategy. Experiments show that LLMs trained on our synthetic datasets achieve 48.50% on AlpacaEval2, 84.00% on GSM8K, and 79.90% on HumanEval, outperforming existing data synthesis methods.
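To make the coarse-to-fine DAG growth concrete, the sketch below builds a reasoning DAG in two phases: a chain of coarse steps, each then expanded into finer sub-steps. This is an illustrative assumption, not the authors' implementation; the names (`ReasoningDAG`, `grow_coarse_to_fine`, `refine`) are hypothetical, and it omits the hierarchical GFlowNet policy that would sample such growth actions in proportion to a reward.

```python
# Minimal sketch (hypothetical, not the paper's code) of coarse-to-fine
# DAG growth for a reasoning structure.
from dataclasses import dataclass, field

@dataclass
class ReasoningDAG:
    nodes: list = field(default_factory=list)   # reasoning steps (text)
    edges: set = field(default_factory=set)     # (parent_idx, child_idx) pairs

    def add_node(self, step, parents=()):
        self.nodes.append(step)
        idx = len(self.nodes) - 1
        for p in parents:
            # Parents always precede the new node, so the graph stays acyclic.
            self.edges.add((p, idx))
        return idx

def grow_coarse_to_fine(coarse_steps, refine):
    """Grow a DAG in two phases: a coarse chain of high-level steps,
    then a fine phase expanding each coarse node into sub-steps."""
    dag = ReasoningDAG()
    coarse_ids, prev = [], None
    for step in coarse_steps:                   # coarse phase
        nid = dag.add_node(step, parents=() if prev is None else (prev,))
        coarse_ids.append(nid)
        prev = nid
    for cid in coarse_ids:                      # fine phase
        for sub in refine(dag.nodes[cid]):
            dag.add_node(sub, parents=(cid,))
    return dag

if __name__ == "__main__":
    dag = grow_coarse_to_fine(
        ["understand problem", "plan solution", "verify answer"],
        refine=lambda step: [f"{step}: detail {i}" for i in range(2)],
    )
    print(len(dag.nodes), "nodes,", len(dag.edges), "edges")
```

In the paper's setting, a GFlowNet would replace the fixed `refine` function, sampling node-addition actions so that complete DAGs are generated with probability proportional to a reward; the resulting DAGs then condition the suggester-editor instruction generation.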