2025
Decoder-Only LLMs can be Masked Auto-Encoders
Dan Qiao | Yuan Gao | Zheming Yang | Di Yang | Ziheng Wu | Pengcheng Lu | Minghui Qiu | Juntao Li | Min Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Modern NLP workflows (e.g., RAG systems) require different models for generation and embedding tasks, where bidirectional pre-trained encoders and decoder-only Large Language Models (LLMs) dominate the respective tasks. Structural differences between these models result in extra development costs and limit knowledge sharing between tasks. In this work, we present UniMAE, a novel unsupervised training method that transforms a Decoder-Only LLM into a Uni-Directional Masked Auto-Encoder. UniMAE compresses high-quality semantic information into the [EOS] embedding while preserving the generation capabilities of LLMs. Comprehensive evaluations across 56 MTEB datasets demonstrate that UniMAE can achieve state-of-the-art results under unsupervised settings with merely 100 training steps, establishing the first effective approach to unifying generation and representation learning in decoder-only architectures.
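The core idea the abstract describes, namely reading a single sentence embedding off a decoder-only LLM at the [EOS] position, can be illustrated with a minimal sketch. This is not the authors' released code or the UniMAE training procedure; the checkpoint name (gpt2) and last-token pooling are illustrative assumptions.

```python
# Minimal sketch: extract an [EOS]-position embedding from a decoder-only LM.
# Assumptions (not from the paper): a generic Hugging Face checkpoint ("gpt2")
# and simple last-token pooling; UniMAE would use its own trained model.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # placeholder decoder-only LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def eos_embedding(text: str) -> torch.Tensor:
    # Append the EOS token so its hidden state can summarize the whole sequence.
    inputs = tokenizer(text + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)
    # Use the hidden state at the final ([EOS]) position as the embedding.
    return hidden[0, -1]

emb = eos_embedding("Decoder-only LLMs can also produce sentence embeddings.")
print(emb.shape)  # torch.Size([hidden_dim])
```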