SAIR-Comb : A Structure-Aware Iterative Refinement Framework for Combinatorics Autoformalization
Weijie Jiang, Gaolei He, Beibei Xiong, Jianlin Wang, Zhengfeng Yang
Abstract
Autoformalization aims to bridge the gap between human mathematical intuition and formal proof by automating the translation of informal reasoning into machine-verifiable languages. Despite significant breakthroughs catalyzed by Large Language Models (LLMs), autoformalizing Combinatorics remains a formidable challenge due to its intricate structural dependencies and the severe scarcity of high-quality formal datasets. To address these challenges, we propose SAIR-Comb, a Structure-Aware Iterative Refinement framework for Combinatorics powered by Lean 4 and LLMs. SAIR-Comb employs a multi-stage pipeline: first, it performs data augmentation and refinement by rectifying syntactic, semantic, and structural errors, guided by a curated manual combinatorics dataset. The model then undergoes a two-stage training regime: expert iteration with syntactic grounding, followed by reinforcement learning (RL) to align formal reasoning trajectories. Furthermore, we introduce Structural Consistency—a rigorous new metric designed to expose formalizing failures that elude traditional semantic-only evaluations. Experiments demonstrate that SAIR-Comb achieves strong performance on the specialized CombiBench while remaining highly competitive on general-domain benchmarks, including PutnamBench and ProverBench.- Anthology ID:
- 2026.acl-long.1075
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 23452–23473
- Language:
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1075/
- DOI:
- Cite (ACL):
- Weijie Jiang, Gaolei He, Beibei Xiong, Jianlin Wang, and Zhengfeng Yang. 2026. SAIR-Comb : A Structure-Aware Iterative Refinement Framework for Combinatorics Autoformalization. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 23452–23473, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- SAIR-Comb : A Structure-Aware Iterative Refinement Framework for Combinatorics Autoformalization (Jiang et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1075.pdf