Maolin Wang
2026
MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search
Sheng Zhang | Junyi Li | Yingyi Zhang | Pengyue Jia | Yichao Wang | Xiaowei Qian | Wenlin Zhang | Maolin Wang | Yong Liu | Xiangyu Zhao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in large language models (LLMs) have scaled the potential for reasoning and agentic search, wherein models autonomously plan, retrieve, and reason over external knowledge to answer complex queries. However, the iterative think–search loop accumulates long system memories, leading to a memory dilution problem. In addition, existing memory management methods struggle to capture fine-grained semantic relations between queries and documents and often lose substantial information. Therefore, we propose MemSearch-o1, an agentic search framework built on reasoning-aligned memory growth and retracing. MemSearch-o1 dynamically grows fine-grained memory fragments from memory seed tokens taken from the queries, then retraces and deeply refines the memory via a contribution function, and finally reorganizes a globally connected memory path. This shifts memory management from stream-like concatenation to structured, token-level growth with path-based reasoning. Experiments on eight benchmark datasets show that MemSearch-o1 substantially mitigates memory dilution and more effectively activates the reasoning potential of diverse LLMs, establishing a solid foundation for memory-aware agentic intelligence.
MTA: A Merge-then-Adapt Framework for Personalized Large Language Models
Xiaopeng Li | Yuanjin Zheng | Wanyu Wang | Wenlin Zhang | Pengyue Jia | Yingyi Zhang | Haiying He | Mengyang Ma | Yiqi Wang | Maolin Wang | Xuetao Wei | Xiangyu Zhao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Personalized Large Language Models (PLLMs) aim to align model outputs with individual user preferences, a crucial capability for user-centric applications. However, the prevalent approach of fine-tuning a separate module for each user faces two major limitations: (1) storage costs scale linearly with the number of users, rendering the method unscalable; and (2) fine-tuning a static model from scratch often yields suboptimal performance for users with sparse data. To address these challenges, we propose MTA, a Merge-then-Adapt framework for PLLMs. MTA comprises three key stages. First, we construct a shared Meta-LoRA Bank by selecting anchor users and pre-training meta-personalization traits within meta-LoRA modules. Second, to ensure scalability and enable dynamic personalization combination beyond static models, we introduce an Adaptive LoRA Fusion stage. This stage retrieves and dynamically merges the most relevant anchor meta-LoRAs to synthesize a user-specific one, thereby eliminating the need for user-specific storage and supporting more flexible personalization. Third, we propose a LoRA Stacking for Few-Shot Personalization stage, which applies an additional ultra-low-rank, lightweight LoRA module on top of the merged LoRA. Fine-tuning this module enables effective personalization under few-shot settings. Extensive experiments on the LaMP benchmark demonstrate that our approach outperforms existing SOTA methods across multiple tasks. Our code is also available.
2025
Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation
Pengyue Jia | Derong Xu | Xiaopeng Li | Zhaocheng Du | Xiangyang Li | Yichao Wang | Yuhao Wang | Qidong Liu | Maolin Wang | Huifeng Guo | Ruiming Tang | Xiangyu Zhao
Findings of the Association for Computational Linguistics: ACL 2025
The reranker and generator are two critical components in the Retrieval-Augmented Generation (RAG) pipeline, responsible for ranking relevant documents and generating responses, respectively. However, due to differences in pre-training data and objectives, there is an inevitable gap between the documents ranked as relevant by the reranker and those required by the generator to support answering the query. To address this gap, we propose RADIO, a novel and practical preference alignment framework with RAtionale DIstillatiOn. Specifically, we first propose a rationale extraction method that leverages the reasoning capabilities of large language models (LLMs) to extract the rationales necessary for answering the query. Subsequently, a rationale-based alignment process is designed to rerank the documents based on the extracted rationales and to fine-tune the reranker to align the preferences. We conduct extensive experiments on two tasks across three datasets to demonstrate the effectiveness of our approach compared to baseline methods. Our code is released online to ease reproduction.
Stepwise Reasoning Disruption Attack of LLMs
Jingyu Peng | Maolin Wang | Xiangyu Zhao | Kai Zhang | Wanyu Wang | Pengyue Jia | Qidong Liu | Ruocheng Guo | Qi Liu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) have made remarkable strides in complex reasoning tasks, but their safety and robustness in reasoning processes remain unexplored, particularly in third-party platforms that facilitate user interactions via APIs. Existing attacks on LLM reasoning are constrained by specific settings or lack of imperceptibility, limiting their feasibility and generalizability. To address these challenges, we propose the Stepwise rEasoning Error Disruption (SEED) attack, which subtly injects errors into prior reasoning steps to mislead the model into producing incorrect subsequent reasoning and final answers. Unlike previous methods, SEED is compatible with zero-shot and few-shot settings, maintains the natural reasoning flow, and ensures covert execution without modifying the instruction. Extensive experiments on four datasets across four different models demonstrate SEED’s effectiveness, revealing the vulnerabilities of LLMs to disruptions in reasoning processes. These findings underscore the need for greater attention to the robustness of LLM reasoning to ensure safety in practical applications. Our code is available at: https://github.com/Applied-Machine-Learning-Lab/SEED-Attack
2015
Co-authors
- Pengyue Jia 4
- Xiangyu Zhao 4
- Xiaopeng Li 2
- Qidong Liu 2
- Wanyu Wang 2
- Yichao Wang 2
- Wenlin Zhang 2
- Yingyi Zhang 2
- Zhaocheng Du 1
- Huifeng Guo 1
- Ruocheng Guo 1
- Haiying He 1
- Mingxuan Huang 1
- Junyi Li 1
- Xiangyang Li 1
- Qi Liu 1
- Yong Liu 1
- Mengyang Ma 1
- Shervin Malmasi 1
- Jingyu Peng 1
- Xiaowei Qian 1
- Ruiming Tang 1
- Yiqi Wang 1
- Yuhao Wang 1
- Xuetao Wei 1
- Derong Xu 1
- Kai Zhang 1
- Sheng Zhang 1
- Yuanjin Zheng 1