FANS: Formal Answer Selection for LLM Natural Language Math Reasoning Using Lean4

Jiarui Yao, Ruida Wang, Tong Zhang


Abstract
Large Language Models (LLMs) have displayed astonishing abilities in various tasks, especially in text generation, classification, question answering, etc. However, the reasoning ability of LLMs still faces many debates, especially in math reasoning. The inherent ambiguity of Natural Language (NL) limits LLMs’ ability to perform verifiable reasoning, making the answers lack coherence and trustworthy support. To tackle the above challenges, we propose a novel framework named FANS: Formal ANswer Selection for LLM Natural Language Math Reasoning Using Lean4. It is a pioneering framework that utilizes Lean4 to enhance LLMs’ NL math reasoning ability. In particular, given an NL math question and LLM-generated answers, FANS first translates it into Lean4 theorem statements. Then it invokes another Lean4 prover LLM to produce proofs, and finally verifies the proofs by Lean4 compiler. Answers are selected based on the verifications. It enhances LLMs’ NL math ability in providing a computer-verifiable solution for its correct answer and proposes an alternative method for answer selection beyond the reward model based ones. Our experiments demonstrate the effectiveness of FANS with an improvement of nearly 2% across several math benchmarks, and even higher further based on reward models or in subfields such as algebra and number theory that Lean4 is better at. The code is available in https://github.com/MaxwellJryao/FANS.
Anthology ID:
2025.emnlp-main.158
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3181–3200
Language:
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.158/
DOI:
Bibkey:
Cite (ACL):
Jiarui Yao, Ruida Wang, and Tong Zhang. 2025. FANS: Formal Answer Selection for LLM Natural Language Math Reasoning Using Lean4. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 3181–3200, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
FANS: Formal Answer Selection for LLM Natural Language Math Reasoning Using Lean4 (Yao et al., EMNLP 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.158.pdf
Checklist:
 2025.emnlp-main.158.checklist.pdf