Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu, Enyan Dai


Abstract
Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, this routing strategy introduces a new security concern: adversaries may manipulate the router into consistently selecting expensive, high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R2A, which aims to mislead black-box LLM routers into selecting expensive models via adversarial suffix optimization. Specifically, R2A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that R2A significantly increases the routing rate to expensive models on queries from different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.
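For readers unfamiliar with the setting, the following is a minimal illustrative sketch of cost-aware routing and of the "routing rate to expensive models" metric mentioned in the abstract. It is not the R2A method or any router evaluated in the paper. The scoring heuristic (`toy_difficulty_score`), the threshold value, and the model labels `"strong"` and `"weak"` are all assumptions made up for illustration.

```python
from typing import Callable, List


def toy_difficulty_score(query: str) -> float:
    """Hypothetical difficulty estimate in [0, 1].

    Assumption for illustration only: longer queries are treated as harder.
    Real routers typically use a learned scorer instead.
    """
    return min(len(query.split()) / 20.0, 1.0)


def route(query: str, scorer: Callable[[str], float], threshold: float = 0.5) -> str:
    """Send the query to the expensive model when its score meets the threshold."""
    return "strong" if scorer(query) >= threshold else "weak"


def expensive_routing_rate(
    queries: List[str], scorer: Callable[[str], float], threshold: float = 0.5
) -> float:
    """Fraction of queries dispatched to the expensive ("strong") model."""
    if not queries:
        return 0.0
    hits = sum(route(q, scorer, threshold) == "strong" for q in queries)
    return hits / len(queries)


queries = [
    "What is 2 + 2?",
    "Prove that the sum of the first n odd numbers equals n squared "
    "and explain each step of the induction argument in detail.",
]

for q in queries:
    print(route(q, toy_difficulty_score), "<-", q[:40])
print("expensive routing rate:", expensive_routing_rate(queries, toy_difficulty_score))
```

Under this toy scorer, the short arithmetic question goes to the cheap model and the long proof request goes to the expensive one. An attack of the kind the abstract describes would aim to push this rate toward 1.0 without access to the router's internals.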
Anthology ID:
2026.acl-long.2051
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
44333–44347
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.2051/
Cite (ACL):
Haochun Tang, Yuliang Yan, Jiahua Lu, Huaxiao Liu, and Enyan Dai. 2026. Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 44333–44347, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization (Tang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.2051.pdf
Checklist:
2026.acl-long.2051.checklist.pdf