Yan Xing

2026

While large language models (LLMs) have mastered syntax-level code generation, complex algorithmic reasoning remains a challenge, typically addressed by scaling model depth and parameter count. Universal Transformers (UT) offer a compelling alternative by introducing a recurrent inductive bias that aligns with the recursive nature of programming logic. However, training looped architectures at scale has historically been hindered by severe instability and optimization difficulties associated with backpropagation through time (BPTT). We present LoopCoder (40B-A80B), pre-trained on 12T+ code and general tokens, along with the LoopCoder-Thinking and LoopCoder-Instruct variants. It is the first large-scale looped transformer for code, achieving performance comparable to standard dense architectures with more parameters. Unlike prior approaches that restrict recurrence to small-scale tasks, we implement a comprehensive looped training protocol spanning both pre-training and post-training phases. We initialize the model via a dense-to-loop transformation, folding a pre-trained dense checkpoint into a recurrent block, followed by rigorous looped pre-training and specialized post-training for instruction following and reasoning. Our results establish a robust recipe for scaling coding intelligence via recurrent computation, proving that dense checkpoints serve as an optimal foundation for evolving into dynamic, looped reasoners.
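
To make the dense-versus-looped distinction concrete, here is a toy sketch in NumPy. It is not the LoopCoder method: the abstract does not say how the dense checkpoint is folded, so the layer averaging below is purely an assumption for illustration, and `block`, `dense_forward`, and `looped_forward` are hypothetical names.

```python
import numpy as np

def block(x, w):
    """One toy residual layer: x + tanh(x @ w)."""
    return x + np.tanh(x @ w)

def dense_forward(x, weights):
    """Standard stack: every layer has its own weight matrix."""
    for w in weights:
        x = block(x, w)
    return x

def looped_forward(x, shared_w, n_loops):
    """Looped stack: one weight matrix is reused on every iteration."""
    for _ in range(n_loops):
        x = block(x, shared_w)
    return x

rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=(2, d))
dense_weights = [rng.normal(scale=0.1, size=(d, d)) for _ in range(6)]

# ASSUMPTION: "folding" is shown here as a plain average of the dense layers.
# The abstract does not specify the actual dense-to-loop transformation.
shared_w = np.mean(dense_weights, axis=0)

y = looped_forward(x, shared_w, n_loops=6)
```

The looped model keeps the same depth of computation (six applications of the block) while storing a single weight matrix instead of six, which is the parameter saving that a recurrent block provides.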


Fast Retrieval and Slow Reasoning for Explainable Multimodal Sentiment Analysis
Aoqiang Zhu | Min Hu | Yan Xing
Findings of the Association for Computational Linguistics: ACL 2026

Most existing Multimodal Sentiment Analysis (MSA) methods rely on holistic fusion, treating all modalities and temporal segments equally. Such strategies often introduce redundant information and obscure the decision process, limiting both robustness and interpretability. Inspired by dual-process theory, we propose FRSR (Fast Retrieval and Slow Reasoning), an interpretable framework that decomposes multimodal sentiment modeling into two cooperative pathways. The Fast Pathway acts as a lightweight evidence selector, using context-aware convolution and auxiliary supervision to retrieve a sparse set of Top-K sentiment-relevant cues from noisy multimodal inputs. Based on these cues, the Slow Pathway performs deeper cross-modal reasoning through learnable reasoning tokens, enabling hierarchical sentiment inference. By separating salient evidence retrieval from multimodal reasoning, FRSR improves interpretability while reducing computational cost. Experiments on three benchmark datasets show that FRSR achieves competitive performance, higher efficiency, stronger robustness to noise, and clearer decision transparency than existing holistic fusion methods.
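
The Top-K retrieval step of the Fast Pathway can be illustrated with a minimal sketch. It assumes each temporal segment already has a scalar relevance score; how FRSR computes those scores (context-aware convolution, auxiliary supervision) is not reproduced here, and `select_top_k` is a hypothetical helper, not a name from the paper.

```python
import numpy as np

def select_top_k(features, scores, k):
    """Keep the k highest-scoring segments, preserving temporal order."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of segments")
    # argpartition finds the k largest scores without a full sort
    top = np.argpartition(scores, -k)[-k:]
    top.sort()  # restore the original temporal order of the kept segments
    return top, features[top]

# Six segments with two-dimensional features and made-up relevance scores
features = np.arange(12, dtype=float).reshape(6, 2)
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7])

idx, cues = select_top_k(features, scores, k=3)
```

Only the selected rows in `cues` would be passed on to the slower reasoning stage, which is where the reduction in computational cost comes from.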