Wei Zhang

2025

Pretraining Context Compressor for Large Language Models with Embedding-Based Memory
Yuhong Dai | Jianxun Lian | Yitian Huang | Wei Zhang | Mingyang Zhou | Mingqi Wu | Xing Xie | Hao Liao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Efficient processing of long contexts in large language models (LLMs) is essential for real-world applications such as retrieval-augmented generation and in-context learning, especially in resource-constrained environments like edge computing. This paper explores embedding-based context compression to reduce inference costs while leaving the downstream LLM's configuration unchanged. We propose a decoupled compressor-LLM framework, pretrained on text reconstruction and completion tasks, designed to preserve essential contextual information within condensed embedding representations. Our extensive experiments investigate pretraining, model configurations, compression rates, efficiency across tasks, and adaptability to various LLMs. Results demonstrate that our approach outperforms competitive baselines across three domains and eight datasets while remaining adaptable to different downstream LLMs. We find that thorough pretraining and carefully selected compression rates, such as 4x and 16x, enable a lightweight compressor to strike a good balance between accuracy and speed. These findings underscore the potential of embedding-based compression to enhance LLM efficiency and motivate further research in this area.
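
To illustrate the decoupled compressor idea described in the abstract, the sketch below is a hypothetical PyTorch module (all names, layer sizes, and the pooling scheme are assumptions, not the paper's actual architecture) that maps a long context's token embeddings to a 4x-shorter sequence of memory embeddings that a downstream LLM could consume in place of the raw context; the reconstruction and completion pretraining objectives are not reproduced here.

```python
import torch
import torch.nn as nn


class ToyContextCompressor(nn.Module):
    """Toy sketch: compress n token embeddings into n / rate memory embeddings.

    Hypothetical illustration only; the paper's compressor architecture and
    its pretraining on text reconstruction/completion are not shown here.
    """

    def __init__(self, d_model: int = 512, n_heads: int = 8, rate: int = 4):
        super().__init__()
        self.rate = rate
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Project each group of `rate` contextualized states to one embedding.
        self.pool = nn.Linear(d_model * rate, d_model)

    def forward(self, context_embeds: torch.Tensor) -> torch.Tensor:
        # context_embeds: (batch, seq_len, d_model); seq_len divisible by rate.
        h = self.encoder(context_embeds)
        b, n, d = h.shape
        grouped = h.reshape(b, n // self.rate, d * self.rate)
        return self.pool(grouped)  # (batch, seq_len // rate, d_model)


# Usage: 4x compression of a 128-token context into 32 memory embeddings.
compressor = ToyContextCompressor(d_model=512, rate=4)
memory = compressor(torch.randn(1, 128, 512))
print(memory.shape)  # torch.Size([1, 32, 512])
```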

R-CHAR: A Metacognition-Driven Framework for Role-Playing in Large Language Models
Haiming Qin | Jiwei Zhang | Wei Zhang | KeZhong Lu | Mingyang Zhou | Hao Liao | Rui Mao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Role-playing capabilities in large language models (LLMs) often lack cognitive consistency in complex scenarios that require deep understanding and coherent reasoning. While recent reasoning models excel at math and coding tasks, they show limited effectiveness in open-ended role-playing scenarios. We introduce R-CHAR (Role-Consistent Hierarchical Adaptive Reasoning), a metacognition-driven framework that enhances role-playing performance through guided thinking-trajectory synthesis and adaptive evaluation. Our approach demonstrates that, in role-playing social intelligence tasks, concise thinking processes can achieve superior performance more efficiently than elaborate reasoning chains, outperforming existing specialized models. Experimental results on the SocialBench benchmark show significant and stable performance improvements across varying scenario complexities, with particular strength in long-context comprehension (from 34.64% to 68.59%) and group-level social interactions. Our work advances the development of cognitively consistent role-playing systems, bridging the gap between surface-level mimicry and authentic character simulation.
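
To make the adaptive-evaluation idea concrete, the sketch below is a purely hypothetical Python heuristic (every name, threshold, and prompt template is an assumption, not R-CHAR's actual method) that picks a concise versus a hierarchical thinking trajectory from a coarse estimate of scenario complexity before prompting a role-playing model.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical role-playing scenario description."""
    character: str
    query: str
    context_turns: int  # length of the prior dialogue
    group_size: int     # number of interacting characters


def reasoning_style(s: Scenario) -> str:
    """Toy heuristic for picking a thinking-trajectory style.

    Illustrative only; R-CHAR's metacognitive evaluation is not reproduced.
    """
    complexity = s.context_turns + 5 * s.group_size
    return "concise" if complexity < 20 else "hierarchical"


def build_prompt(s: Scenario) -> str:
    if reasoning_style(s) == "concise":
        guide = "Think in at most two sentences before replying in character."
    else:
        guide = ("First restate the persona's goals, then the social context, "
                 "then reply in character.")
    return f"You are {s.character}. {guide}\nUser: {s.query}"


print(build_prompt(
    Scenario("Sherlock Holmes", "What do you make of the letter?",
             context_turns=8, group_size=2)
))
```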