Mian Zhou



2025

UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation
Zhixiang Lu | Mian Zhou | Angelos Stefanidis | Jionglong Su
Proceedings of The 3rd Workshop on Mathematical Natural Language Processing (MathNLP 2025)

Large Language Models (LLMs) have achieved considerable success in text-based mathematical reasoning, yet their potential remains underexplored in the multimodal mathematics domain where joint text and image understanding is imperative. A key bottleneck hindering progress is the scarcity of high-quality, genuinely multimodal benchmarks. To address this gap, we construct a unified benchmark by consolidating and curating three public multimodal mathematics datasets. We subsequently propose the UniMath-CoT framework, which establishes a robust performance baseline by combining Chain-of-Thought (CoT) principles with efficient Supervised Fine-Tuning (SFT) based on Low-Rank Adaptation (LoRA). Furthermore, to bolster the model’s reasoning robustness, we introduce an innovative verification mechanism, AARI (Answer Affirmation by Re-Inference), which leverages a specialized re-inference protocol to have the model self-scrutinize and validate its initial conclusions. Our comprehensive experiments show that this integrated strategy substantially boosts performance, surpassing a wide range of open-source models and markedly closing the gap with leading proprietary systems.

2018

Generating Description for Sequential Images with Local-Object Attention Conditioned on Global Semantic Context
Jing Su | Chenghua Lin | Mian Zhou | Qingyun Dai | Haoyu Lv
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)