MirrorQA: Benchmarking Multimodal LLMs on Mirror-Orientation Reasoning
Jingping Liu, Xingchen Peng, Yan Zhou, Ziyan Liu, Jie Zhai, Ronghao Chen, Huacan Wang, Xiaofeng Jia
Abstract
Multimodal large language models (MLLMs) have achieved remarkable progress in recent years, yet their ability to perform left–right reasoning in mirror contexts, a fundamental element of spatial cognition, remains underexplored. To address this gap, we introduce MirrorQA, a manually constructed benchmark with 5,549 samples, designed to evaluate MLLMs’ capability to distinguish left from right from a subject-centered perspective. MirrorQA is built through a three-stage pipeline (annotation, verification, and final review) to ensure high-quality labeling. Comprehensive evaluations on both open- and closed-source MLLMs show that even the best-performing models achieve only 65.40% accuracy, far below the 99.28% accuracy of humans. These results highlight substantial challenges in current MLLMs when reasoning about left and right, and point to promising directions for future research. MirrorQA and its code are publicly available at https://github.com/stargazer-zeno/MirrorQA.
- Anthology ID:
- 2026.acl-long.1879
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 40464–40476
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1879/
- Cite (ACL):
- Jingping Liu, Xingchen Peng, Yan Zhou, Ziyan Liu, Jie Zhai, Ronghao Chen, Huacan Wang, and Xiaofeng Jia. 2026. MirrorQA: Benchmarking Multimodal LLMs on Mirror-Orientation Reasoning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 40464–40476, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- MirrorQA: Benchmarking Multimodal LLMs on Mirror-Orientation Reasoning (Liu et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1879.pdf