Compositional Steering of Large Language Models with Steering Tokens

Gorjan Radevski, Kiril Gashteovski, Giwon Hong, Carolin Lawrence, Goran Glavaš


Abstract
Deploying LLMs in real-world applications requires controllable output that satisfies multiple desiderata at the same time. While existing work extensively addresses LLM steering for a single behavior, compositional steering—i.e., steering LLMs simultaneously towards multiple behaviors—remains an underexplored problem. In this work, we propose compositional steering tokens for multi-behavior steering. We first embed individual behaviors, expressed as natural language instructions, into dedicated tokens via self-distillation. Contrary to most prior work, which operates in the activation space, our behavior steers live in the space of input tokens, enabling more effective zero-shot composition. We then train a dedicated composition token on pairs of behaviors and show that it successfully captures the notion of composition: it generalizes well to unseen compositions, including those with unseen behaviors as well as those with an unseen number of behaviors. Our experiments across different LLM architectures show that steering tokens lead to superior multi-behavior steering of verifiable constraints (e.g., length, format, structure, language) compared to competing approaches (instructions, activation steering, and LoRA merging). Moreover, we show that steering tokens complement natural language instructions, with their combination resulting in further gains.
Anthology ID:
2026.acl-long.1435
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
31087–31104
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1435/
Cite (ACL):
Gorjan Radevski, Kiril Gashteovski, Giwon Hong, Carolin Lawrence, and Goran Glavaš. 2026. Compositional Steering of Large Language Models with Steering Tokens. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 31087–31104, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Compositional Steering of Large Language Models with Steering Tokens (Radevski et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1435.pdf
Checklist:
2026.acl-long.1435.checklist.pdf