Liangcai Gao

2025

In this paper, we propose a new data synthesis method called LogicPro, which leverages LeetCode-style algorithm Problems and their corresponding Program solutions to synthesize Complex Logical Reasoning data in text format. First, we synthesize complex reasoning problems through source algorithm problems and test cases. Then, standard answers and intermediate variable outputs are obtained for each problem based on standard python solutions and test cases. Finally, with the guidance of code intermediate variables, we synthesize the text reasoning process for each reasoning problems. Through this method, we can synthesize data that is difficult, scalable, effective, and comes with golden standard answers and high-quality reasoning processes. As a result, with our 540K synthesized dataset constructed solely from 2,360 algorithm problems, our approach achieves significant improvements in multiple models for the datasets BBH^27, LogicBench, DROP, AR-LSAT, and GSM8K, etc. outperforming a wide range of existing reasoning datasets.

2023

pdf bib abs
GeoDRL: A Self-Learning Framework for Geometry Problem Solving using Reinforcement Learning in Deductive Reasoning
Shuai Peng | Di Fu | Yijun Liang | Liangcai Gao | Zhi Tang
Findings of the Association for Computational Linguistics: ACL 2023

Ensuring both interpretability and correctness is a great challenge in automated geometry problem solving (GPS), and the scarcity of labeled data hinders learning mathematical reasoning from samples. Therefore, we present GeoDRL, a self-learning geometry problem solving framework that integrates logic graph deduction and Deep Reinforcement Learning (DRL) to optimize geometry reasoning as a Markov Decision Process. GeoDRL employs a Graph Neural Network on a Geometry Logic Graph, updating the problem state using a symbolic system. Incorporating DRL into deductive reasoning enables GeoDRL to achieve unsupervised self-learning while maintaining correctness. GeoDRL, through unsupervised learning, exhibits enhanced accuracy in the Geometry3K dataset, improving by 11.1% over previous SOTA methods, and simultaneously boosts efficiency and interpretability.

Co-authors

Venues

acl1
findings1

Fix author