Ruijie Zhang




2025

CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Ziyue Liu | Ruijie Zhang | Zhengyang Wang | Mingsong Yan | Zi Yang | Paul D. Hovland | Bogdan Nicolae | Franck Cappello | Sui Tang | Zheng Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

The full-size MLPs and the projection layers in attention introduce tremendous model sizes of large language models (LLMs), consuming extensive computational resources in pre-training. We empirically observe that the activations of pre-trained LLMs exhibit a low-rank property. Motivated by this observation, we propose **CoLA** and its memory-efficient implementation, **CoLA-M**, to replace these full-size layers with compute-efficient **auto-encoders** that naturally enforce low-rank activations throughout training. This fundamental architectural change eliminates the activation redundancy and significantly boosts model capacity and training efficiency. Experiments on LLaMA models with 60 million to 7 billion parameters show that CoLA reduces the computing cost by 2× and improves training throughput by 1.86× while maintaining full-rank level performance. CoLA-M further squeezes memory cost without sacrificing throughput, offering a pre-training approach with collectively superior parameter, computing, and memory efficiency. The LLMs produced are also 2× smaller, enabling faster inference with lower memory cost on resource-constrained platforms.
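
To make the size argument concrete, here is a minimal sketch of the arithmetic behind swapping one full-size linear layer for a two-matrix bottleneck. It is an illustration only, not the authors' implementation: the function names, the layer dimensions, and the rank are hypothetical, and the abstract does not state which ranks CoLA uses or where any nonlinearity sits.

```python
# Illustrative parameter counts only; names and dimensions are assumptions,
# not taken from the CoLA paper.

def full_layer_params(d_in: int, d_out: int) -> int:
    """Weights in a single dense d_in x d_out projection (bias ignored)."""
    return d_in * d_out


def bottleneck_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in a down-projection (d_in x rank) followed by an
    up-projection (rank x d_out), i.e. an auto-encoder-style bottleneck."""
    return d_in * rank + rank * d_out


def reduction_factor(d_in: int, d_out: int, rank: int) -> float:
    """How many times smaller the bottleneck is than the dense layer."""
    return full_layer_params(d_in, d_out) / bottleneck_params(d_in, d_out, rank)


if __name__ == "__main__":
    # Hypothetical square layer with a rank of one quarter of the width.
    d, r = 4096, 1024
    print(full_layer_params(d, d))    # dense weight count
    print(bottleneck_params(d, d, r)) # bottleneck weight count
    print(reduction_factor(d, d, r))  # ratio of the two
```

For a square layer of width d, the bottleneck is smaller than the dense layer whenever the rank is below d/2, and the same ratio applies to the multiply-accumulate count of a forward pass, since each weight is used once per input token.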