SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

Kaushal Kumar Maurya, Kv Aditya Srivatsa, Ekaterina Kochmar


Abstract
Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications, driving the accelerated development of a large number of diverse models. However, individual LLMs show limitations in generalization and performance on complex tasks due to inherent training biases, model size constraints, and the quality or diversity of pre-training datasets. A promising direction is to efficiently harness the diverse capabilities of LLMs to overcome these individual limitations. To this end, we introduce a novel LLM selection algorithm called SelectLLM, which efficiently directs each input query to the most suitable subset of LLMs from a large pool, ensuring that the selected models collectively provide accurate responses. SelectLLM employs a multi-label classifier and a policy based on the classifier’s predictions and confidence scores to select an optimal, query-aware, and lightweight subset of LLMs. Our findings indicate that the proposed model outperforms existing ensemble-based baselines and achieves competitive performance with similarly sized top-performing LLMs while maintaining efficiency. Specifically, it achieves a substantial reduction in inference latency on two challenging reasoning benchmarks: 13% on GSM8K and 70% on MMLU, compared to the top-performing baseline. We also establish a theoretical upper bound given by an Oracle over the LLMs and perform an in-depth linguistic analysis to understand the performance gap between the Oracle and SelectLLM.
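As a rough illustration of the idea described in the abstract (a multi-label classifier scores each LLM in the pool for a given query, and a policy picks a small subset from those confidence scores), the following is a minimal sketch. It is not the authors' implementation: the threshold, the cap on subset size, the top-1 fallback, the majority-vote aggregation, and all names (select_llms, majority_vote, model_a, ...) are assumptions made for this example only.

```python
from collections import Counter


def select_llms(confidences, threshold=0.5, max_models=3):
    """Pick a small, query-specific subset of LLMs.

    `confidences` maps a model name to the (hypothetical) classifier's
    confidence that this model answers the query correctly. The threshold,
    the size cap, and the fallback are illustrative assumptions, not values
    taken from the paper.
    """
    if not confidences:
        raise ValueError("confidences must contain at least one model")
    # Rank models from most to least confident.
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    # Keep confident models only, capped to keep inference lightweight.
    chosen = [name for name, score in ranked if score >= threshold][:max_models]
    # If no model clears the threshold, fall back to the single best one.
    if not chosen:
        chosen = [ranked[0][0]]
    return chosen


def majority_vote(answers):
    """Combine the selected models' answers by simple majority (an assumption)."""
    return Counter(answers).most_common(1)[0][0]


# Example with made-up confidence scores for one query.
scores = {"model_a": 0.9, "model_b": 0.2, "model_c": 0.6}
subset = select_llms(scores)
final_answer = majority_vote(["42", "41", "42"])
```

Under these assumptions, only the models in `subset` would be queried, which is where a latency saving over running the full pool could come from.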
Anthology ID:
2025.findings-acl.1072
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
20847–20863
URL:
https://preview.aclanthology.org/landing_page/2025.findings-acl.1072/
Cite (ACL):
Kaushal Kumar Maurya, Kv Aditya Srivatsa, and Ekaterina Kochmar. 2025. SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 20847–20863, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models (Maurya et al., Findings 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.findings-acl.1072.pdf