Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking

Chan-Wei Hu, Zhengzhong Tu


Abstract
Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-rankers typically process the full query image as a global embedding, making them susceptible to visual distractors (e.g., background clutter) that skew similarity scores. We propose **Region-R1**, a query-side region cropping framework that formulates region selection as a decision-making problem during re-ranking, allowing the system to learn whether to retain the full image or focus only on a question-relevant region before scoring the retrieved candidates. Region-R1 learns a policy with a novel region-aware group relative policy optimization (r-GRPO) to dynamically crop a discriminative region. Across two challenging benchmarks, E-VQA and InfoSeek, Region-R1 delivers consistent gains, achieving state-of-the-art performance by increasing conditional Recall@1 by up to 20%. These results show the promise of query-side adaptation as a simple but effective way to strengthen MM-RAG re-ranking.
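The crop-or-keep decision and the group-relative training signal described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify r-GRPO, so the advantage below uses the standard GRPO normalization (reward minus group mean, divided by group standard deviation) as a stand-in. The function names (`crop_or_keep`, `rerank`, `group_relative_advantages`), the box format, the cosine scorer, and the toy rewards are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def crop_or_keep(image, box=None):
    """Query-side action (hypothetical format): box=None keeps the full image,
    otherwise return the pixels inside (top, left, bottom, right)."""
    if box is None:
        return image
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(query_vec, candidates):
    """Sort (candidate_id, vector) pairs by similarity to the query, best first.
    A real re-ranker would score with a learned model; cosine is a placeholder."""
    return sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantage: normalize each sampled action's reward
    against the other actions sampled for the same query (assumed stand-in
    for the paper's region-aware variant)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy usage: four sampled actions for one query, rewarded 1.0 when the gold
# candidate ends up at rank 1 after re-ranking (hypothetical reward).
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
region = crop_or_keep(image, (0, 1, 2, 3))
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

In this sketch, actions that lift the gold candidate to the top receive positive advantages and the rest receive negative ones, so a policy trained on them would be pushed toward crops (or the full image) that help the re-ranker.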
Anthology ID:
2026.findings-acl.510
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10492–10505
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.510/
Cite (ACL):
Chan-Wei Hu and Zhengzhong Tu. 2026. Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking. In Findings of the Association for Computational Linguistics: ACL 2026, pages 10492–10505, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking (Hu & Tu, Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.510.pdf
Checklist:
 2026.findings-acl.510.checklist.pdf