Chaim Baskin
2026
CARES: Context-Aware Resolution Selector for VLMs
Moshe Kimhi | Nimrod Shabtay | Raja Giryes | Chaim Baskin | Eli Schwartz
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large vision–language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates visual tokens to 97-99% of total tokens, resulting in high compute and latency even when low-resolution images would suffice. We introduce CARES (Context-Aware Resolution Selector), a lightweight preprocessing module that, given an image–query pair, predicts the minimal sufficient input resolution. CARES uses a compact VLM (350M) to extract features and predict the resolution at which a target pretrained VLM’s ability to answer correctly reaches its peak. Though trained as a discrete classifier over a set of candidate resolutions, CARES interpolates continuous resolutions at inference for fine-grained control. Across five multimodal benchmarks spanning documents and natural images, as well as diverse target VLMs, CARES preserves task performance while reducing compute by up to 80%.
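The abstract says a classifier trained over discrete resolutions is interpolated to continuous resolutions at inference, but does not say how. One plausible reading, offered only as a sketch, is a probability-weighted average of the candidate resolutions. The function name `expected_resolution`, the candidate resolutions, and the `multiple` parameter (snapping to a patch-size multiple) are all hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def expected_resolution(
    probs: Sequence[float],
    resolutions: Sequence[int],
    multiple: int = 14,  # hypothetical patch size; not from the paper
) -> int:
    """Turn class probabilities over discrete resolutions into one resolution.

    Assumption: interpolation is a probability-weighted average. The paper
    may use a different rule.
    """
    if len(probs) != len(resolutions) or not resolutions:
        raise ValueError("probs and resolutions must be non-empty and equal length")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")

    # Weighted average of the candidate resolutions (normalised by total).
    expected = sum(p * r for p, r in zip(probs, resolutions)) / total

    # Round up to the next multiple so the image tiles evenly into patches.
    snapped = math.ceil(expected / multiple) * multiple

    # Stay within the range the classifier was trained on.
    return int(min(max(snapped, min(resolutions)), max(resolutions)))


if __name__ == "__main__":
    candidates = [224, 448, 896]  # illustrative values only
    print(expected_resolution([0.25, 0.5, 0.25], candidates))
```

Rounding up rather than to the nearest multiple errs toward keeping detail, which fits the stated goal of a minimal but sufficient resolution. A deployed selector might instead threshold the cumulative probability.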
2025
Jailbreak Attack Initializations as Extractors of Compliance Directions
Amit LeVi | Rom Himelstein | Yaniv Nemcovsky | Avi Mendelson | Chaim Baskin
Findings of the Association for Computational Linguistics: EMNLP 2025
Safety-aligned LLMs respond to prompts with either compliance or refusal, each corresponding to distinct directions in the model’s activation space. Recent studies have shown that initializing attacks via self-transfer from other prompts significantly enhances their performance. However, the underlying mechanisms of these initializations remain unclear, and attacks rely on arbitrary or hand-picked initializations. This work shows that each gradient-based jailbreak attack and its subsequent initialization gradually converge to a single compliance direction that suppresses refusal, thereby enabling an efficient transition from refusal to compliance. Based on this insight, we propose CRI, an initialization framework that aims to project unseen prompts further along compliance directions. We demonstrate our approach on multiple attacks, models, and datasets, achieving an increased attack success rate (ASR) and reduced computational overhead, highlighting the fragility of safety-aligned LLMs.