CARES: Context-Aware Resolution Selector for VLMs
Moshe Kimhi, Nimrod Shabtay, Raja Giryes, Chaim Baskin, Eli Schwartz
Abstract
Large vision–language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates visual tokens to 97–99% of total tokens, resulting in high compute and latency, even when low-resolution images would suffice. We introduce CARES (Context-Aware Resolution Selector), a lightweight preprocessing module that, given an image–query pair, predicts the minimal sufficient input resolution. CARES uses a compact VLM (350M) to extract features and predict when a target pretrained VLM’s response converges to its peak ability to answer correctly. Though trained as a discrete classifier over a set of optional resolutions, CARES interpolates continuous resolutions at inference for fine-grained control. Across five multimodal benchmarks spanning documents and natural images, as well as diverse target VLMs, CARES preserves task performance while reducing compute by up to 80%.
- Anthology ID:
- 2026.acl-long.102
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2243–2256
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.102/
- Cite (ACL):
- Moshe Kimhi, Nimrod Shabtay, Raja Giryes, Chaim Baskin, and Eli Schwartz. 2026. CARES: Context-Aware Resolution Selector for VLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2243–2256, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- CARES: Context-Aware Resolution Selector for VLMs (Kimhi et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.102.pdf
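
The abstract states that CARES is trained as a discrete classifier over a set of resolutions but interpolates continuous resolutions at inference. The abstract does not say how the interpolation is done. One plausible reading, offered here purely as an assumption and not as the authors' method, is a probability-weighted average of the candidate resolutions. The sketch below illustrates that idea; the candidate values in `CANDIDATE_RESOLUTIONS` and the function names are hypothetical.

```python
import math

# Hypothetical candidate resolutions (longest image side, in pixels).
# These values are illustrative only and do not come from the paper.
CANDIDATE_RESOLUTIONS = [384, 768, 1024]


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate_resolution(logits, resolutions=CANDIDATE_RESOLUTIONS):
    """Assumed interpolation rule: expected resolution under the class probabilities.

    The result always lies between the smallest and largest candidate,
    so a confident prediction collapses to a single discrete resolution
    while an uncertain one lands somewhere in between.
    """
    probs = softmax(logits)
    return sum(p * r for p, r in zip(probs, resolutions))


if __name__ == "__main__":
    # An undecided classifier yields the plain average of the candidates.
    print(interpolate_resolution([0.0, 0.0, 0.0]))
    # A classifier confident in the lowest class yields roughly that resolution.
    print(interpolate_resolution([50.0, 0.0, 0.0]))
```

Under this assumed rule, the selected resolution moves smoothly as the classifier's confidence shifts between classes, which is one way a discrete head could provide the fine-grained control the abstract describes.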