Mixing Inference-time Experts for Enhancing LLM Reasoning

Soumya Sanyal, Tianyi Xiao, Xiang Ren


Abstract
Large Language Models (LLMs) have demonstrated impressive reasoning abilities, but their generated rationales often suffer from issues such as reasoning inconsistency and factual errors, undermining their reliability. Prior work has explored improving rationale quality via multi-reward fine-tuning or reinforcement learning (RL), where models are optimized for diverse objectives. While effective, these approaches train the model in a fixed manner, offer no inference-time adaptability, and cannot adapt to the reasoning requirements of new test-time inputs. Another line of work trains specialized reasoning experts using reward signals and uses them to improve generation at inference time. Existing methods in this paradigm are limited to a single expert and cannot improve multiple reasoning aspects at once. To address this, we propose MIXIE, a novel inference-time expert-mixing framework that dynamically determines mixing proportions for each expert, enabling contextualized and flexible fusion. We demonstrate the effectiveness of MIXIE in improving chain-of-thought reasoning in LLMs by merging commonsense and entailment reasoning experts fine-tuned on reward-filtered data. Our approach outperforms existing baselines on three question-answering datasets: StrategyQA, CommonsenseQA, and ARC, highlighting its potential to enhance LLM reasoning with efficient, adaptable expert integration.
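The abstract does not spell out MIXIE's exact mixing mechanism, but the core idea it describes, combining next-token distributions from several reasoning experts with context-dependent proportions at decoding time, can be illustrated with a minimal sketch. The function name, tensor shapes, and fixed weights below are assumptions for illustration only, not the authors' implementation; in MIXIE the mixing proportions are determined dynamically per input rather than hard-coded.

import torch
import torch.nn.functional as F

def mix_expert_next_token(expert_logits, mixing_weights):
    # expert_logits:  list of [vocab_size] tensors, one per expert model.
    # mixing_weights: [num_experts] tensor summing to 1; MIXIE would derive
    #                 these dynamically from the context, here they are
    #                 supplied directly (hypothetical values).
    probs = torch.stack([F.softmax(l, dim=-1) for l in expert_logits])  # [E, V]
    mixed = (mixing_weights.unsqueeze(-1) * probs).sum(dim=0)           # [V]
    return mixed

# Toy example: a base model plus commonsense and entailment experts.
torch.manual_seed(0)
vocab_size = 8
expert_logits = [torch.randn(vocab_size) for _ in range(3)]

# Fixed weights for illustration only.
weights = torch.tensor([0.5, 0.3, 0.2])

mixed_probs = mix_expert_next_token(expert_logits, weights)
next_token = torch.argmax(mixed_probs).item()
print(mixed_probs.sum().item(), next_token)  # mixed distribution sums to 1.0

At each decoding step the mixed distribution replaces the single-model next-token distribution, so the fusion adds no training cost and can be re-weighted per input.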
Anthology ID:
2025.emnlp-main.1077
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
21246–21260
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1077/
Cite (ACL):
Soumya Sanyal, Tianyi Xiao, and Xiang Ren. 2025. Mixing Inference-time Experts for Enhancing LLM Reasoning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 21246–21260, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Mixing Inference-time Experts for Enhancing LLM Reasoning (Sanyal et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1077.pdf
Checklist:
 2025.emnlp-main.1077.checklist.pdf