To Intervene or Not: Guiding Inference-time Alignment with Probabilistic Model Blending

Jin Gan; Xin Li; Jun Luo


Abstract

The wide deployment of LLMs has made alignment necessary so that newly trained models respond to user instructions safely and effectively. Among alignment methods, inference-time alignment is often cheaper because it intervenes (i.e., offers guidance) only during output generation. Existing proposals apply guidance extracted from certain aligned models without properly assessing its reliability. However, our systematic evaluation reveals that guidance effectiveness varies drastically across models: ineffective guidance causes further confusion and therefore further interventions, so excessive interventions typically indicate poor performance. To make interventions more effective and thus more efficient, we introduce BlendIn, an inference-time alignment framework that replaces binary intervention decisions with hybrid distributions integrating both models’ knowledge. BlendIn stabilizes inference-time alignment by performing quality-aware alignment and weighting each model’s contribution in proportion to its reliability. Compared with existing work, it preserves beneficial guidance while downweighting unreliable suggestions. BlendIn provides both diagnostic signals and mitigation strategies for misaligned guidance, achieving consistent performance improvements of up to 50% on challenging model pairs. Our code is available at: https://github.com/DecayingSeart/BlendIn.
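To make the contrast between a binary intervention decision and a hybrid distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the blend is a convex mixture of two next-token distributions; the function names (`softmax`, `blend`, `binary_switch`) and the scalar `weight` are hypothetical, and how BlendIn actually estimates reliability is not stated in the abstract.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_switch(p_base, p_guide, intervene):
    """Binary decision: use the guide distribution entirely, or not at all."""
    return list(p_guide) if intervene else list(p_base)


def blend(p_base, p_guide, weight):
    """Hybrid distribution: convex mixture of the two distributions.

    `weight` is an assumed reliability score in [0, 1] giving the share
    assigned to the guide model; the remainder stays with the base model.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(p_base) != len(p_guide):
        raise ValueError("distributions must cover the same vocabulary")
    return [(1.0 - weight) * b + weight * g for b, g in zip(p_base, p_guide)]


# Toy three-token vocabulary with made-up logits.
p_base = softmax([2.0, 1.0, 0.0])
p_guide = softmax([0.0, 1.0, 2.0])

# A low weight keeps most of the base model's mass while still
# letting the guide shift the distribution.
mixed = blend(p_base, p_guide, 0.25)
```

Under this assumption, a weight of 0 or 1 recovers the two outcomes of the binary switch, while intermediate weights downweight an unreliable guide without discarding it.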

Anthology ID: 2026.findings-acl.847
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 17158–17169
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.847/
Cite (ACL): Jin Gan, Xin Li, and Jun Luo. 2026. To Intervene or Not: Guiding Inference-time Alignment with Probabilistic Model Blending. In Findings of the Association for Computational Linguistics: ACL 2026, pages 17158–17169, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): To Intervene or Not: Guiding Inference-time Alignment with Probabilistic Model Blending (Gan et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.847.pdf
Checklist: 2026.findings-acl.847.checklist.pdf
