The Side Effects of Being Smart: Safety Risks in MLLMs’ Multi-Image Reasoning

Renmiao Chen; Yida Lu; Shiyao Cui; Xuan Ouyang; Victor Shea-Jay Huang; Shumin Zhang; Chengwei Pan; Han Qiu; Minlie Huang

Abstract

As Multimodal Large Language Models (MLLMs) acquire stronger reasoning capabilities to handle complex, multi-image instructions, this advancement may pose new safety risks. We study this problem by introducing MIR-SafetyBench, the first benchmark focused on multi-image reasoning safety, which consists of 2,676 instances across a taxonomy of 9 multi-image relations. Our extensive evaluations on 19 MLLMs reveal a troubling trend: models with more advanced multi-image reasoning can be more vulnerable on MIR-SafetyBench. Beyond attack success rates, we find that many responses labeled as safe are superficial, often driven by misunderstanding or evasive, non-committal replies. We further observe that unsafe generations exhibit lower attention entropy than safe ones on average. This internal signature suggests a possible risk that models may over-focus on task solving while neglecting safety constraints.
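The abstract does not say how attention entropy is computed. As a rough illustration only, the sketch below assumes it is the Shannon entropy of a single normalized attention distribution (one query's weights over the tokens it attends to). The function name `attention_entropy` and the example weights are hypothetical and do not come from the paper. A lower value means the attention mass is concentrated on fewer positions.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution.

    Assumption: `weights` are non-negative attention weights for a single
    query; they are renormalized here so they sum to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    probs = [w / total for w in weights]
    # Zero-probability positions contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical attention rows over four positions.
focused = [0.97, 0.01, 0.01, 0.01]   # mass concentrated on one position
diffuse = [0.25, 0.25, 0.25, 0.25]   # mass spread evenly

print(attention_entropy(focused) < attention_entropy(diffuse))
```

Under this reading, the reported pattern would correspond to unsafe generations looking more like `focused` than `diffuse` on average. How the paper aggregates over heads, layers, and tokens is not stated in the abstract.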

Anthology ID: 2026.acl-long.1710
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 36866–36882
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1710/
Cite (ACL): Renmiao Chen, Yida Lu, Shiyao Cui, Xuan Ouyang, Victor Shea-Jay Huang, Shumin Zhang, Chengwei Pan, Han Qiu, and Minlie Huang. 2026. The Side Effects of Being Smart: Safety Risks in MLLMs’ Multi-Image Reasoning. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 36866–36882, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): The Side Effects of Being Smart: Safety Risks in MLLMs’ Multi-Image Reasoning (Chen et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1710.pdf
Checklist: 2026.acl-long.1710.checklist.pdf
