Union-of-Experts: Neurons in Mixture-of-Experts are Secretly Routers

Songhao Wu, Ang Lv, Ruobing Xie, Samm Sun, Di Wang, Rui Yan, Yankai Lin


Abstract
Mixture-of-Experts (MoE) models rely on an external router to assign tokens to experts. This design inherently separates the routing decision from each expert’s internal capabilities, leading to suboptimal performance. In this work, we address this limitation with Union-of-Experts (UoE), an MoE variant that performs "expert-autonomous routing". The core mechanism of UoE is to pre-designate a minute fraction of the neurons within each expert as "routing neurons". Experts autonomously select relevant tokens by comparing the activation intensity of these neurons, which aligns routing decisions with each expert’s functional profile. To avoid wasting the activations of unselected experts’ routing neurons, we aggregate all routing neuron outputs and sum them into the final layer output. This aggregation acts as a novel virtual shared expert whose parameters are distributed across the individual experts, and it improves overall parameter efficiency. We pre-train UoE models with up to 3B parameters and demonstrate that they outperform traditional MoEs at matched efficiency. Furthermore, our analysis of the routing neurons provides insights into expert-autonomous selection and the broader routing mechanisms of MoE models.
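The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the expert architecture, how activation intensity is measured, or whether expert outputs are gated. The sketch assumes two-layer ReLU experts, takes the first `n_route` hidden units of each expert as its routing neurons, scores intensity as the L2 norm of their activations, and applies no gating weights. The names `uoe_layer`, `n_route`, and `top_k` are hypothetical.

```python
import numpy as np


def uoe_layer(x, w_in, w_out, n_route=2, top_k=1):
    """Toy expert-autonomous routing layer (illustrative assumptions only).

    x:     (tokens, d_model) input activations
    w_in:  (experts, d_model, d_ff) first-layer weights of each expert
    w_out: (experts, d_ff, d_model) second-layer weights of each expert
    """
    n_tokens = x.shape[0]

    # Every expert evaluates only its routing neurons for every token.
    # Assumption: the routing neurons are the first n_route hidden units.
    h_route = np.maximum(
        np.einsum("td,edr->ter", x, w_in[:, :, :n_route]), 0.0
    )  # shape (tokens, experts, n_route)

    # Assumption: "activation intensity" is the L2 norm of those activations.
    scores = np.linalg.norm(h_route, axis=-1)  # shape (tokens, experts)

    # Each token goes to the top_k experts with the strongest routing neurons.
    chosen = np.argsort(-scores, axis=-1)[:, :top_k]  # shape (tokens, top_k)

    # Virtual shared expert: routing-neuron outputs of ALL experts are summed
    # into the layer output, including those of unselected experts.
    out = np.einsum("ter,erd->td", h_route, w_out[:, :n_route, :])

    # Selected experts additionally contribute their remaining neurons.
    for t in range(n_tokens):
        for e in chosen[t]:
            h_rest = np.maximum(x[t] @ w_in[e, :, n_route:], 0.0)
            out[t] += h_rest @ w_out[e, n_route:, :]

    return out, chosen
```

Under these assumptions, setting `top_k` to the number of experts makes the layer equal to the sum of all full expert outputs, which is a convenient sanity check that the split into routing neurons and remaining neurons loses nothing.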
Anthology ID:
2026.acl-long.1675
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
36193–36206
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1675/
Cite (ACL):
Songhao Wu, Ang Lv, Ruobing Xie, Samm Sun, Di Wang, Rui Yan, and Yankai Lin. 2026. Union-of-Experts: Neurons in Mixture-of-Experts are Secretly Routers. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 36193–36206, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Union-of-Experts: Neurons in Mixture-of-Experts are Secretly Routers (Wu et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1675.pdf
Checklist:
 2026.acl-long.1675.checklist.pdf