Select-then-Route : Taxonomy guided Routing for LLMs

Soham Shah, Kumar Shridhar


Abstract
Recent advances in large language models (LLMs) have boosted performance across a broad spectrum of natural-language tasks, yet no single model excels uniformly across domains. Sending each query to the most suitable model mitigates this limitation, but deciding among *all* available LLMs for each query is prohibitively expensive. Both the accuracy and the latency can improve if the decision space for the model choice is first narrowed, followed by selecting the suitable model for the given query. We introduce Select-then-Route (StR), a two-stage framework that first *selects* a small, task-appropriate pool of LLMs and then *routes* each query within that pool through an adaptive cascade. StR first employs a lightweight, *taxonomy-guided selector* that maps each query to models proven proficient for its semantic class (e.g., reasoning, code, summarisation). Within the selected pool, a *confidence-based cascade* begins with the cheapest model and escalates only when a multi-judge agreement test signals low reliability. Across six public benchmarks of various domains, StR improves the end-to-end accuracy from 91.7% (best single model) to 94.3% while reducing inference cost by 4×. Because both the taxonomy and multi-judge evaluation thresholds are tunable, StR exposes a smooth cost–accuracy frontier, enabling users to dial in the trade-off that best fits their latency and budget constraints.
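The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Model` type, the `select_pool` and `route` functions, the toy classifier, the toy judges, the cost units, and the `min_agreement` threshold are all hypothetical names and values chosen for illustration. The sketch only assumes what the abstract states, namely that a selector maps a query's semantic class to a pool of models, and that a cascade starts with the cheapest model and escalates when judge agreement is low.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for an LLM endpoint."""
    name: str
    cost: float                       # assumed per-query cost, arbitrary units
    answer: Callable[[str], str]      # maps a query to a response


# A judge inspects (query, answer) and votes on whether the answer is reliable.
Judge = Callable[[str, str], bool]


def select_pool(query: str,
                classify: Callable[[str], str],
                taxonomy: Dict[str, List[Model]]) -> List[Model]:
    """Stage 1: map the query to its semantic class and return that class's
    models, ordered from cheapest to most expensive."""
    return sorted(taxonomy[classify(query)], key=lambda m: m.cost)


def route(query: str,
          pool: Sequence[Model],
          judges: Sequence[Judge],
          min_agreement: float) -> Tuple[str, str, float]:
    """Stage 2: try models in pool order and stop at the first answer that
    enough judges accept. Returns (model name, answer, total model cost).
    Judge cost is ignored here for simplicity."""
    if not pool or not judges:
        raise ValueError("pool and judges must both be non-empty")
    spent = 0.0
    for model in pool:
        answer = model.answer(query)
        spent += model.cost
        agreement = sum(judge(query, answer) for judge in judges) / len(judges)
        if agreement >= min_agreement:
            break                     # confident enough; do not escalate
    # If no answer passes, the last (most expensive) model's answer is kept.
    return model.name, answer, spent


# Toy setup (all values are made up for the example).
cheap = Model("cheap", 1.0, lambda q: "4" if q == "What is 2 + 2?" else "unsure")
strong = Model("strong", 10.0, lambda q: "56" if "7 * 8" in q else "4")
taxonomy = {"math": [strong, cheap], "other": [cheap]}


def classify(query: str) -> str:
    """Toy taxonomy classifier: any digit means the query is 'math'."""
    return "math" if any(ch.isdigit() for ch in query) else "other"


judges: List[Judge] = [
    lambda q, a: a != "unsure",
    lambda q, a: a.isdigit(),
    lambda q, a: len(a) > 0,
]

easy = route("What is 2 + 2?",
             select_pool("What is 2 + 2?", classify, taxonomy), judges, 2 / 3)
hard = route("What is 7 * 8?",
             select_pool("What is 7 * 8?", classify, taxonomy), judges, 2 / 3)
print(easy)   # ('cheap', '4', 1.0)
print(hard)   # ('strong', '56', 11.0)
```

In this toy run the easy query is answered by the cheapest model alone, while the harder query escalates once and pays for both calls. Raising or lowering `min_agreement` changes how often escalation happens, which is one way to picture the tunable cost–accuracy trade-off the abstract mentions.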
Anthology ID:
2025.emnlp-industry.28
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2025
Address:
Suzhou (China)
Editors:
Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
425–441
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.28/
Cite (ACL):
Soham Shah and Kumar Shridhar. 2025. Select-then-Route : Taxonomy guided Routing for LLMs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 425–441, Suzhou (China). Association for Computational Linguistics.
Cite (Informal):
Select-then-Route : Taxonomy guided Routing for LLMs (Shah & Shridhar, EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.28.pdf