UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation

Zhixiang Lu; Mian Zhou; Angelos Stefanidis; Jionglong Su

Abstract

Large Language Models (LLMs) have achieved considerable success in text-based mathematical reasoning, yet their potential remains underexplored in the multimodal mathematics domain where joint text and image understanding is imperative. A key bottleneck hindering progress is the scarcity of high-quality, genuinely multimodal benchmarks. To address this gap, we construct a unified benchmark by consolidating and curating three public multimodal mathematics datasets. We subsequently propose the UniMath-CoT framework, which establishes a robust performance baseline by combining Chain-of-Thought (CoT) principles with efficient Supervised Fine-Tuning (SFT) based on Low-Rank Adaptation (LoRA). Furthermore, to bolster the model’s reasoning robustness, we introduce an innovative verification mechanism, AARI (Answer Affirmation by Re-Inference), which leverages a specialized re-inference protocol to have the model self-scrutinize and validate its initial conclusions. Our comprehensive experiments show that this integrated strategy substantially boosts performance, surpassing a wide range of open-source models and markedly closing the gap with leading proprietary systems.

Anthology ID: 2025.mathnlp-main.13
Volume: Proceedings of The 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025)
Month: November
Year: 2025
Address: Suzhou, China
Editors: Marco Valentino, Deborah Ferreira, Mokanarangan Thayaparan, Leonardo Ranaldi, Andre Freitas
Venues: MathNLP | WS
Publisher: Association for Computational Linguistics
Pages: 176–185
URL: https://preview.aclanthology.org/ingest-emnlp/2025.mathnlp-main.13/
Cite (ACL): Zhixiang Lu, Mian Zhou, Angelos Stefanidis, and Jionglong Su. 2025. UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation. In Proceedings of The 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025), pages 176–185, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation (Lu et al., MathNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.mathnlp-main.13.pdf
