Wenke Lee
2026
Behavior Knowledge Merge in Reinforced Agentic Models
Xiangchi Yuan | Dachuan Shi | Chunhui Zhang | Zheyuan Liu | Shenglong Yao | Soroush Vosoughi | Wenke Lee
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reinforcement learning (RL) is central to post-training, particularly for agentic models that require specialized reasoning behaviors. In this setting, model merging offers a practical mechanism for integrating multiple RL-trained agents from different tasks into a single generalist model. However, existing merging methods are designed for supervised fine-tuning (SFT), and they are suboptimal at preserving task-specific capabilities in RL-trained agentic models. The root cause is a task-vector mismatch between RL and SFT: on-policy RL induces task vectors that are highly sparse and heterogeneous, whereas SFT-style merging implicitly assumes dense and globally comparable task vectors. When standard global averaging is applied under this mismatch, the non-overlapping entries of RL task vectors, which encode critical task-specific behaviors, are shrunk, and the parameter updates are diluted. To address this issue, we propose Reinforced Agent Merging (RAM), a distribution-aware merging framework explicitly designed for RL-trained agentic models. RAM disentangles shared and task-specific unique parameter updates, averaging shared components while selectively preserving and rescaling unique ones to counteract parameter update dilution. Experiments across multiple agent domains and model architectures demonstrate that RAM not only surpasses merging baselines, but also unlocks synergistic potential among agents to achieve performance superior to that of specialized agents in their domains.
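The dilution effect described in this abstract can be illustrated with a small sketch. This is not the authors' RAM implementation: the function names, the `unique_scale` parameter, the `eps` threshold, and the flat-list representation of model weights are all assumptions made for illustration. It contrasts plain global averaging of task vectors with a rule that averages positions updated by several agents and keeps (optionally rescales) positions updated by only one agent.

```python
def global_average(base, finetuned):
    """Plain task-vector averaging: every position is divided by the number of agents."""
    n = len(finetuned)
    return [b + sum(m[j] - b for m in finetuned) / n for j, b in enumerate(base)]


def merge_sparse_task_vectors(base, finetuned, unique_scale=1.0, eps=1e-8):
    """Illustrative sketch (hypothetical, not the published RAM algorithm).

    base:      list of floats, the shared pre-RL weights
    finetuned: list of weight lists, one per RL-trained agent
    """
    merged = []
    for j, b in enumerate(base):
        # Task-vector entries at position j, ignoring negligible updates.
        active = [m[j] - b for m in finetuned if abs(m[j] - b) > eps]
        if not active:
            merged.append(b)  # no agent changed this weight
        elif len(active) == 1:
            # Unique update: keep it (optionally rescaled) instead of dividing by n.
            merged.append(b + unique_scale * active[0])
        else:
            # Shared update: average over the agents that actually changed it.
            merged.append(b + sum(active) / len(active))
    return merged


base = [0.0, 0.0, 0.0, 0.0]
agent_a = [1.0, 0.0, 0.5, 0.0]  # updates positions 0 and 2
agent_b = [0.0, 2.0, 1.5, 0.0]  # updates positions 1 and 2
print(global_average(base, [agent_a, agent_b]))
print(merge_sparse_task_vectors(base, [agent_a, agent_b]))
```

In this toy case, global averaging halves the updates at positions 0 and 1, which only one agent touched, while the sketch keeps them at full magnitude and averages only position 2, where both agents overlap.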
2025
Superficial Self-Improved Reasoners Benefit from Model Merging
Xiangchi Yuan | Chunhui Zhang | Zheyuan Liu | Dachuan Shi | Leyan Pan | Soroush Vosoughi | Wenke Lee
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) rely heavily on large-scale reasoning data, but as such data becomes increasingly scarce, model self-improvement offers a promising alternative. However, this process can lead to model collapse, as the model’s output becomes overly deterministic with reduced diversity. In this work, we identify a new risk beyond model collapse, which we term the Superficial Self-Improved Reasoners phenomenon. This phenomenon indicates that while self-improvement enhances in-domain (ID) reasoning accuracy, it degrades the model’s generalized reasoning capability on out-of-domain (OOD) datasets, as the model tends to memorize the training data. Our analyses of layer importance and parameter changes reveal that reasoning-critical layers receive fewer updates compared to less relevant layers during self-improvement. To address this, we propose Iterative Model Merging (IMM), which balances reasoning improvements and generalization by merging the weights of the original and self-improved models. IMM effectively mitigates model collapse and improves generalized reasoning capability. Code is available at https://github.com/xiangchi-yuan/merge_syn
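The idea of merging the original and self-improved weights can be sketched as a linear interpolation applied once per self-improvement round. This is a minimal illustration under stated assumptions, not the released IMM code: `interpolate`, `iterative_merge`, the `alpha` weight, the choice to merge back toward the original weights each round, and the `self_improve` callable are all hypothetical, and weights are represented as flat lists of floats.

```python
def interpolate(original, improved, alpha=0.5):
    """Weight-space interpolation: alpha=0 keeps the original, alpha=1 keeps the improved model."""
    return [(1 - alpha) * o + alpha * s for o, s in zip(original, improved)]


def iterative_merge(original, self_improve, rounds=2, alpha=0.5):
    """Illustrative loop (assumed structure): self-improve, then merge with the original weights."""
    current = list(original)
    for _ in range(rounds):
        improved = self_improve(current)  # hypothetical training step
        current = interpolate(original, improved, alpha)
    return current


# Toy stand-in for a self-improvement step: shift every weight by 1.0.
def toy_step(weights):
    return [w + 1.0 for w in weights]


print(iterative_merge([0.0, 2.0], toy_step, rounds=2, alpha=0.5))
```

Interpolating toward the original weights after each round limits how far the merged model can drift from its starting point, which is the intuition behind balancing in-domain gains against out-of-domain generalization.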