Mian Zhou
2025
UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation
Zhixiang Lu | Mian Zhou | Angelos Stefanidis | Jionglong Su
Proceedings of The 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025)
Large Language Models (LLMs) have achieved considerable success in text-based mathematical reasoning, yet their potential remains underexplored in the multimodal mathematics domain where joint text and image understanding is imperative. A key bottleneck hindering progress is the scarcity of high-quality, genuinely multimodal benchmarks. To address this gap, we construct a unified benchmark by consolidating and curating three public multimodal mathematics datasets. We subsequently propose the UniMath-CoT framework, which establishes a robust performance baseline by combining Chain-of-Thought (CoT) principles with efficient Supervised Fine-Tuning (SFT) based on Low-Rank Adaptation (LoRA). Furthermore, to bolster the model’s reasoning robustness, we introduce an innovative verification mechanism, AARI (Answer Affirmation by Re-Inference), which leverages a specialized re-inference protocol to have the model self-scrutinize and validate its initial conclusions. Our comprehensive experiments show that this integrated strategy substantially boosts performance, surpassing a wide range of open-source models and markedly closing the gap with leading proprietary systems.
2018
Generating Description for Sequential Images with Local-Object Attention Conditioned on Global Semantic Context
Jing Su | Chenghua Lin | Mian Zhou | Qingyun Dai | Haoyu Lv
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)
Co-authors
- Qingyun Dai 1
- Chenghua Lin 1
- Zhixiang Lu 1
- Haoyu Lv 1
- Angelos Stefanidis 1