Di Wang
2026
Union-of-Experts: Neurons in Mixture-of-Experts are Secretly Routers
Songhao Wu | Ang Lv | Ruobing Xie | Samm Sun | Di Wang | Rui Yan | Yankai Lin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Mixture-of-Experts (MoE) models rely on an external router to assign tokens to experts. This design inherently separates the routing decision from each expert’s internal capabilities, leading to suboptimal performance. In this work, we address this limitation with Union-of-Experts (UoE), an MoE variant that performs "expert-autonomous routing". The core mechanism of UoE is to pre-designate a minute fraction of neurons within each expert as "routing neurons". Experts autonomously select relevant tokens by comparing the activation intensity of these neurons, aligning routing decisions with each expert’s functional profile. To prevent the waste of activations from unselected experts’ routing neurons, we aggregate all routing neuron outputs and sum them into the final layer output. This aggregation acts as a novel virtual shared expert whose parameters are distributed across the individual experts, and improves overall parameter efficiency. We pre-train UoE models with up to 3B parameters, demonstrating that they outperform traditional MoEs with matched efficiency. Furthermore, our analysis of the routing neurons provides valuable insights into expert-autonomous selection and the broader routing mechanisms of MoE models.
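The routing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ReLU activation, the L2 norm as the "activation intensity", per-token top-k selection, and the names `uoe_forward`, `n_routing`, and `top_k` are all assumptions made for illustration. Each expert is a list of neurons, and the first `n_routing` neurons of every expert act as its routing neurons.

```python
import math

def neuron_output(x, w_in, w_out):
    # One hidden neuron: ReLU(w_in . x), then scaled along w_out (assumed form).
    act = max(0.0, sum(a * b for a, b in zip(w_in, x)))
    return act, [act * w for w in w_out]

def uoe_forward(x, experts, n_routing, top_k):
    """Hypothetical single-token forward pass.

    experts: list of experts; each expert is a list of (w_in, w_out) neurons.
    Returns (layer output, indices of selected experts, routing scores).
    """
    shared = [0.0] * len(x)   # sum of ALL routing-neuron outputs
    scores = []               # one routing score per expert
    for neurons in experts:
        sq = 0.0
        for w_in, w_out in neurons[:n_routing]:
            act, out = neuron_output(x, w_in, w_out)
            sq += act * act
            shared = [s + o for s, o in zip(shared, out)]
        scores.append(math.sqrt(sq))  # assumed intensity measure: L2 norm

    # Keep the experts whose routing neurons respond most strongly.
    order = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(order[:top_k])

    # Routing-neuron outputs are always added; other neurons only if selected.
    y = shared[:]
    for i in chosen:
        for w_in, w_out in experts[i][n_routing:]:
            _, out = neuron_output(x, w_in, w_out)
            y = [a + b for a, b in zip(y, out)]
    return y, chosen, scores
```

In this sketch the `shared` vector plays the role of the virtual shared expert described above: it is computed from every expert's routing neurons, so no routing activation is discarded even when its expert is not selected.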
Reinforcement Learning on Pre-Training Data
Siheng Li | Kejiao Li | Zenan Xu | Guanhua Huang | Kun Li | Haoyuan Wu | Wujiajia | Zihao Zheng | Chenchen Zhang | Kun Shi | Xue Gong | Qi Yi | Ruibin Xiong | Tingqiang Xu | Yuhao Jiang | Jianfeng Yan | Yuyuan Zeng | Guanghui Xu | Jinbao Xue | Zhijiang xu | Zheng Fang | Shuai LI | Qibin Liu | Xiaoxue Li | Zhuoyu Li | Yangyu Tao | Fei Gao | Cheng Jiang | Bochao Wang | Kai Liu | Jianchen Zhu | Wai Lam | Bo Zhou | Di Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent progress in large language models (LLMs) is largely driven by scaling training compute through either pre-training with next-token prediction (NTP) or post-training with reinforcement learning (RL). The former contributes to learning broad knowledge and skills from general data, while struggling with data inefficiency and catastrophic forgetting in continual learning settings. The latter incentivizes reasoning capabilities with strong generalization, but is constrained by limited data availability due to its reliance on human annotation. To alleviate these issues, we propose Reinforcement Learning on Pre-Training data (RLPT), which combines the advantages of learning from general data and RL. In particular, RLPT derives reward signals directly from general text data through a next-segment reasoning objective, rewarding the policy for correctly predicting next text segments conditioned on the prefix text. Experiments across multiple benchmarks and models demonstrate the effectiveness of RLPT. For example, RLPT yields substantial improvements in continual pre-training (+4.6%) and provides a strong foundation for post-training (+3.4%) on Qwen3-8B-Base.
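The next-segment objective can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's method: the abstract does not say how text is segmented or how a prediction is judged correct, so the pre-split `segments` input, the whitespace-normalized exact-match reward, and the function names below are all hypothetical.

```python
def next_segment_examples(segments):
    # Pair every prefix of the text with the segment that follows it.
    # Assumption: the text has already been split into segments.
    return [(" ".join(segments[:i]), segments[i])
            for i in range(1, len(segments))]

def segment_reward(predicted, reference):
    # Assumed reward: 1.0 on an exact match after collapsing whitespace,
    # else 0.0. A real system would likely use a softer correctness check.
    same = " ".join(predicted.split()) == " ".join(reference.split())
    return 1.0 if same else 0.0
```

Each `(prefix, target)` pair gives the policy a prefix to condition on, and `segment_reward` scores its predicted continuation against the target, so the reward comes from the raw text itself and needs no human annotation.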
Co-authors
- Zheng Fang 1
- Fei Gao 1
- Xue Gong 1
- Guanhua Huang 1
- Yuhao Jiang 1
- Cheng Jiang 1
- Shuai LI 1
- Wai Lam 1
- Siheng Li 1
- Kejiao Li 1
- Kun Li 1
- Xiaoxue Li 1
- Zhuoyu Li 1
- Yankai Lin (林衍凯) 1
- Qibin Liu 1
- Kai Liu 1
- Ang Lv 1
- Kun Shi 1
- Xingwu Sun 1
- Yangyu Tao 1
- Bochao Wang 1
- Songhao Wu 1
- Haoyuan Wu 1
- Wujiajia 1
- Ruobing Xie 1
- Ruibin Xiong 1
- Zenan Xu 1
- Tingqiang Xu 1
- Guanghui Xu 1
- Jinbao Xue 1
- Rui Yan 1
- Jianfeng Yan 1
- Qi Yi 1
- Yuyuan Zeng 1
- Chenchen Zhang 1
- Zihao Zheng 1
- Bo Zhou 1
- Jianchen Zhu 1
- Zhijiang xu 1
Venues
- ACL 2