Yang Liu

Other people with similar names: Yang Janet Liu (Georgetown University; 刘洋), Yang Liu (Tsinghua), Yang Liu (Fudan), Yang Liu (BIGAI), Yang Liu, Yang Liu (Hunan), Yang Liu, Yang Liu (3M Health Information Systems), Yang Liu, Yang Liu, Yang Liu (UC Santa Cruz), Yang Liu (South China University of Technology), Yang Liu, Yang Liu (NTU), Yang Liu (Sun Yat-sen University), Yang Liu (North Carolina Central University), Yang Liu (Beijing Language and Culture University), Yang Liu (National University of Defense Technology), Yang Liu (Edinburgh Ph.D., Microsoft), Yang Liu (University of Helsinki), Yang Liu (The Chinese University of Hong Kong (Shenzhen)), Yang Liu (刘扬; Ph.D Purdue; ICSI, Dallas, Facebook, Liulishuo, Amazon), Yang Liu (刘洋; ICT, Tsinghua, Beijing Academy of Artificial Intelligence), Yang Liu (Microsoft Cognitive Services Research), Yang Liu (刘扬) (Peking University), Yang Liu (Samsung Research Center Beijing), Yang Liu (Tianjin University, China), Yang Liu (Univ. of Michigan, UC Santa Cruz), Yang Liu (Wilfrid Laurier University)

Unverified author pages with similar names: Yang Liu

2026

FFN Lens: How Transformers Divide Labor for Multilingual Tasks
Jiatong Li | Hailong Cao | Yang Liu
Findings of the Association for Computational Linguistics: ACL 2026

Large Language Models (LLMs) demonstrate strong performance in multilingual tasks, yet the process of constructing predictions in the target language remains under-explored. In this work, we introduce the FFN Lens, a novel interpretability method focusing on the Transformer’s core computational module, the Feed-Forward Network (FFN). By directly leveraging model parameters, the FFN Lens identifies both the critical units responsible for constructing specific information and the input features that drive them, which is essential for understanding Large Language Models. Applying FFN Lens to multilingual tasks, we demonstrate the prediction construction process and reveal the distinct division of labor across model layers. We identify a three-stage functional pipeline for constructing multilingual predictions: Latent Translation, Semantic Mapping, and Self Emphasis. We further introduce subspace analysis to validate this three-stage mechanism from a complementary perspective, and leverage these mechanistic insights to propose a training-free uncertainty estimation method.

Co-authors

Hailong Cao 1
Jiatong Li 1

Venues

Findings 1
