Ran Tao
2026
LoopCoder: Scaling Code Intelligence via Looped Language Models
Jian Yang | Wei Zhang | Shuyue Guo | Yizhi Li | Linzheng Chai | Zhengmao Ye | Shukai Liu | Yuyang Song | Jiajun Wu | Che Liu | Tianyu Zheng | Siwei Wu | Leo L | Xudong Ma | Chuan Hao | Ran Tao | Yan Xing | Jianzhou Wang | Mingjie Tang | Aishan Liu | Zhoujun Li | Xianglong Liu | Weifeng Lv | Bryan Dai
Findings of the Association for Computational Linguistics: ACL 2026
While large language models (LLMs) have mastered syntax-level code generation, complex algorithmic reasoning remains a challenge, typically addressed by scaling model depth and parameter count. Universal Transformers (UT) offer a compelling alternative by introducing a recurrent inductive bias that aligns with the recursive nature of programming logic. However, training looped architectures at scale has historically been hindered by severe instability and optimization difficulties associated with backpropagation through time (BPTT). We present LoopCoder (40B-A80B) pre-trained on 12T+ code and general tokens, along with LoopCoder-Thinking and LoopCoder-Instruct variants—the first large-scale looped transformer for code, achieving comparable performance to standard dense architectures with more parameters. Unlike prior approaches that restrict recurrence to small-scale tasks, we implement a comprehensive looped training protocol spanning both pre-training and post-training phases. We initiate the model via dense-to-loop transformation, folding a pre-trained dense checkpoint to initialize a recurrent block, followed by rigorous looped pre-training and specialized post-training for instruction following and reasoning. Our results establish a robust recipe for scaling coding intelligence via recurrent computation, proving that dense checkpoints serve as an optimal foundation for evolving into dynamic, looped reasoners.
Polymorphic Universal Transformer
Yilong Chen | Zitian Gao | Yihao Xiao | Jason Klein Liu | Xinyu Yang | Yifan Luo | Haoming Luo | Zhengmao Ye | Tingwen Liu | Ran Tao | Bryan Dai
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Although the Universal Transformer (UT) mitigates the diminishing returns of standard LLM scaling by decoupling parameter count from depth, it remains constrained by linear computational costs and rigid weight-sharing mechanisms. These limitations lead to severe functional homogeneity, which subsequently induces over-smoothing, representation rank collapse, and degraded reasoning performance. In this work, we present the first systematic study of Compute Distribution Skew, identifying it as the primary driver of extrapolation failure. This is a pathological phenomenon in ultra-deep recurrent Transformers characterized by a disproportionate distribution of contributions across recurrent steps, resulting in distinct functional states during prefix and suffix processing phases. To address this challenge, we propose the Polymorphic Transformer, which aims to achieve functional polymorphism and depth sparsity within a shared-parameter framework. By integrating conditional sparse subspaces, SiLU Attention, and an uncertainty-aware depth scheduler, our architecture mitigates power-method collapse and effectively decouples logical depth from computational cost. Experiments demonstrate that our model significantly enhances representation rank and robustness, achieving complex reasoning performance comparable to baseline while reducing computation by 64.7%.
2025
Speech-Like Cues and the Limits of Musicality: Lexical Tone Normalization in Mandarin across Speech, Rap, and Song Contexts
Yujia Tian | Yanyuan Ye | Mingxi Lu | Fanlu Jia | Ran Tao
Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation
Effect of Emotional Congruency and Cognitive Load on Word Processing
Jieyu Chen | Yujia Tian | Jing Qi | Ran Tao
Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation
2024
Effect of Rap Music Context on Lexical Tone Normalization
Yujia Tian | Yanyuan Ye | Mingxi Lu | Fanlu Jia | Ran Tao
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation
The Influence of Language on Personality Traits: A Multi-modal Study Among Chinese-English Bilinguals
Mingxi Lu | Ran Tao
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation
Mandarin speakers prefer explicit visual cues in learning Cantonese tones: an eye-tracking study
Yuqin Shu | Yi Weng | Ran Tao | Gang Peng
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation
2020
Co-authors
- Mingxi Lu 3
- Yujia Tian 3
- Bryan Dai 2
- Fanlu Jia 2
- Gang Peng 2
- Yanyuan Ye 2
- Zhengmao Ye 2
- Linzheng Chai 1
- Jieyu Chen 1
- Yilong Chen 1
- Zitian Gao 1
- Shuyue Guo 1
- Chuan Hao 1
- Leo L 1
- Yizhi Li 1
- Zhoujun Li 1
- Aishan Liu 1
- Che Liu 1
- Jason Klein Liu 1
- Shukai Liu 1
- Tingwen Liu 1
- Xianglong Liu 1
- Haoming Luo 1
- Yifan Luo 1
- Weifeng Lv 1
- Xudong Ma 1
- Jing Qi 1
- Yuqin Shu 1
- Yuyang Song 1
- Mingjie Tang 1
- Jianzhou Wang 1
- Yi Weng 1
- Jiajun Wu 1
- Siwei Wu 1
- Yihao Xiao 1
- Yan Xing 1
- Jian Yang 1
- Xinyu Yang 1
- Wei Zhang 1
- Tianyu Zheng 1