Predicate-Guided Generation for Mathematical Reasoning

Jiajun Chen, Yik-Cheung Tam


Abstract
We present Prolog-MATH, a curated corpus designed to support mathematical reasoning in large language models (LLMs) through logic programming. Each verbal math problem in the dataset is paired with a chain-of-thought explanation, from which a Prolog program is generated via a two-stage automated pipeline. In the first stage, an LLM (e.g., DeepSeek-V3) predicts a set of relevant mathematical predicates that could be useful in solving the problem. In the second stage, the LLM uses these suggested predicates, along with the expected answer type, to generate a complete Prolog program. To improve coverage, we fine-tune an open-source LLM with supervised fine-tuning, followed by GRPO (Group Relative Policy Optimization) training, to address problems that DeepSeek-V3 fails to solve. To support this training, we propose a predicate-aware reward function that evaluates how well the generated solution incorporates the suggested predicates, complementing the standard binary reward. Experimental results show that: 1) our two-stage pipeline achieves 81.3% solution coverage on the MATH training set; 2) GRPO training with the predicate-aware reward function enables a series of base models to correctly solve additional problems missed by DeepSeek-V3, further increasing solution coverage to 97.4%. Data and source code are available in the GitHub repository.
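The abstract does not give the form of the predicate-aware reward, so the following is only a minimal sketch of one plausible shape for it, not the authors' definition. It assumes the reward is the standard binary correctness term plus a weighted fraction of suggested predicates that appear as calls in the generated Prolog program. The function names, the regex-based predicate detection, and the `weight` parameter are all hypothetical.

```python
import re


def predicates_used(program):
    """Return names that appear as predicate calls (name followed by '(').

    Assumption: a simple regex over the Prolog source is a good enough
    proxy for "the program uses this predicate".
    """
    return set(re.findall(r"\b([a-z][A-Za-z0-9_]*)\s*\(", program))


def predicate_aware_reward(program, suggested, is_correct, weight=0.5):
    """Binary correctness reward plus a bonus for using suggested predicates.

    `weight` is a hypothetical hyperparameter scaling the predicate bonus.
    """
    binary = 1.0 if is_correct else 0.0
    if not suggested:
        # Nothing was suggested, so only correctness can be rewarded.
        return binary
    used = predicates_used(program)
    overlap = sum(1 for name in suggested if name in used) / len(suggested)
    return binary + weight * overlap


# Usage: one of two suggested predicates appears in a correct program.
program = "solve(X) :- gcd(12, 18, G), X is G * 2."
reward = predicate_aware_reward(program, ["gcd", "lcm"], is_correct=True)
```

Under these assumptions, an incorrect program that still uses the suggested predicates receives a small non-zero reward, which gives policy-gradient training a denser signal than the binary reward alone.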
Anthology ID:
2025.emnlp-main.462
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9097–9110
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.462/
Cite (ACL):
Jiajun Chen and Yik-Cheung Tam. 2025. Predicate-Guided Generation for Mathematical Reasoning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 9097–9110, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Predicate-Guided Generation for Mathematical Reasoning (Chen & Tam, EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.462.pdf
Checklist:
 2025.emnlp-main.462.checklist.pdf