2025
pdf
bib
abs
Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation
Haozhe Xu
|
Xiaohua Wang
|
Changze Lv
|
Xiaoqing Zheng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Conversational recommender systems (CRSs) enhance recommendation quality by engaging users in multi-turn dialogues, capturing nuanced preferences through natural language interactions. However, these systems often face the false negative issue, where items that a user might like are incorrectly labeled as negative during training, leading to suboptimal recommendations. Expanding the label set through data augmentation presents an intuitive solution but faces the challenge of balancing two key aspects: ensuring semantic relevance and preserving the collaborative information inherent in CRS datasets. To address these issues, we propose a novel data augmentation framework that first leverages an LLM-based semantic retriever to identify diverse and semantically relevant items, which are then filtered by a relevance scorer to remove noisy candidates. Building on this, we introduce a two-stage training strategy balancing semantic relevance and collaborative information. Extensive experiments on two benchmark datasets and user simulators demonstrate significant and consistent performance improvements across various recommenders, highlighting the effectiveness of our approach in advancing CRS performance.
pdf
bib
abs
Multi-Programming Language Sandbox for LLMs
Shihan Dou
|
Jiazheng Zhang
|
Jianxiang Zang
|
Yunbo Tao
|
Weikang Zhou
|
Haoxiang Jia
|
Shichun Liu
|
Yuming Yang
|
Shenxi Wu
|
Zhiheng Xi
|
Muling Wu
|
Rui Zheng
|
Changze Lv
|
Limao Xiong
|
Shaoqing Zhang
|
Lin Zhang
|
Wenyu Zhan
|
Rongxiang Weng
|
Jingang Wang
|
Xunliang Cai
|
Yueming Wu
|
Ming Wen
|
Yixin Cao
|
Tao Gui
|
Xipeng Qiu
|
Qi Zhang
|
Xuanjing Huang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
We introduce MPLSandbox, an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler and analysis tools for Large Language Models (LLMs). It can automatically identify the programming language of the code, compiling and executing it within an isolated sub-sandbox to ensure safety and stability. In addition, MPLSandbox integrates both traditional and LLM-based code analysis tools, providing a comprehensive analysis of generated code. It also can be effortlessly integrated into the training and deployment of LLMs to improve the quality and correctness of generated code. It also helps researchers streamline their workflows for various LLM-based code-related tasks, reducing the development cost. To validate the effectiveness of MPLSandbox, we conduct extensive experiments by integrating it into several training and deployment scenarios, and employing it to optimize workflows for a wide range of downstream code tasks. Our goal is to enhance researcher productivity on LLM-based code tasks by simplifying and automating workflows through delegation to MPLSandbox.
pdf
bib
abs
Revisiting Jailbreaking for Large Language Models: A Representation Engineering Perspective
Tianlong Li
|
Zhenghua Wang
|
Wenhao Liu
|
Muling Wu
|
Shihan Dou
|
Changze Lv
|
Xiaohua Wang
|
Xiaoqing Zheng
|
Xuanjing Huang
Proceedings of the 31st International Conference on Computational Linguistics
The recent surge in jailbreaking attacks has revealed significant vulnerabilities in Large Language Models (LLMs) when exposed to malicious inputs. While various defense strategies have been proposed to mitigate these threats, there has been limited research into the underlying mechanisms that make LLMs vulnerable to such attacks. In this study, we suggest that the self-safeguarding capability of LLMs is linked to specific activity patterns within their representation space. Although these patterns have little impact on the semantic content of the generated text, they play a crucial role in shaping LLM behavior under jailbreaking attacks. Our findings demonstrate that these patterns can be detected with just a few pairs of contrastive queries. Extensive experimentation shows that the robustness of LLMs against jailbreaking can be manipulated by weakening or strengthening these patterns. Further visual analysis provides additional evidence for our conclusions, providing new insights into the jailbreaking phenomenon. These findings highlight the importance of addressing the potential misuse of open-source LLMs within the community.
pdf
bib
abs
Tell Me What You Don’t Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing
Wenhao Liu
|
Siyu An
|
Junru Lu
|
Muling Wu
|
Tianlong Li
|
Xiaohua Wang
|
Changze Lv
|
Xiaoqing Zheng
|
Di Yin
|
Xing Sun
|
Xuanjing Huang
Findings of the Association for Computational Linguistics: ACL 2025
Role-Playing Agents (RPAs) have shown remarkable performance in various applications, yet they often struggle to recognize and appropriately respond to hard queries that conflict with their role-play knowledge. To investigate RPAs’ performance when faced with different types of conflicting requests, we develop an evaluation benchmark that includes contextual knowledge conflicting requests, parametric knowledge conflicting requests, and non-conflicting requests to assess RPAs’ ability to identify conflicts and refuse to answer appropriately without over-refusing. Through extensive evaluation, we find that most RPAs behave significant performance gaps toward different conflict requests. To elucidate the reasons, we conduct an in-depth representation-level analysis of RPAs under various conflict scenarios. Our findings reveal the existence of rejection regions and direct response regions within the model’s forwarding representation, and thus influence the RPA’s final response behavior. Therefore, we introduce a lightweight representation editing approach that conveniently shifts conflicting requests to the rejection region, thereby enhancing the model’s refusal accuracy. The extensive experiments validate the effectiveness of our editing method, improving RPAs’ refusal ability of conflicting requests while maintaining their general role-playing capabilities.
pdf
bib
abs
TripTailor: A Real-World Benchmark for Personalized Travel Planning
Kaimin Wang
|
Yuanzhe Shen
|
Changze Lv
|
Xiaoqing Zheng
|
Xuanjing Huang
Findings of the Association for Computational Linguistics: ACL 2025
The continuous evolution and enhanced reasoning capabilities of large language models (LLMs) have elevated their role in complex tasks, notably in travel planning, where demand for personalized, high-quality itineraries is rising. However, current benchmarks often rely on unrealistic simulated data, failing to reflect the differences between LLM-generated and real-world itineraries. Existing evaluation metrics, which primarily emphasize constraints, fall short of providing a comprehensive assessment of the overall quality of travel plans. To address these limitations, we introduce TripTailor, a benchmark designed specifically for personalized travel planning in real-world scenarios. This dataset features an extensive collection of over 500,000 real-world points of interest (POIs) and nearly 4,000 diverse travel itineraries, complete with detailed information, providing a more authentic evaluation framework. Experiments show that fewer than 10% of the itineraries generated by the latest state-of-the-art LLMs achieve human-level performance. Moreover, we identify several critical challenges in travel planning, including the feasibility, rationality, and personalized customization of the proposed solutions. We hope that TripTailor will drive the development of travel planning agents capable of understanding and meeting user needs while generating practical itineraries.
pdf
bib
abs
Improving Continual Pre-training Through Seamless Data Packing
Ruicheng Yin
|
Xuan Gao
|
Changze Lv
|
Xiaohua Wang
|
Xiaoqing Zheng
|
Xuanjing Huang
Findings of the Association for Computational Linguistics: ACL 2025
Continual pre-training has demonstrated significant potential in enhancing model performance, particularly in domain-specific scenarios. The most common approach for packing data before continual pre-training involves concatenating input texts and splitting them into fixed-length sequences. While straightforward and efficient, this method often leads to excessive truncation and context discontinuity, which can hinder model performance. To address these issues, we explore the potential of data engineering to enhance continual pre-training, particularly its impact on model performance and efficiency. We propose Seamless Packing (SP), a novel data packing strategy aimed at preserving contextual information and enhancing model performance. Our approach employs a sliding window technique in the first stage that synchronizes overlapping tokens across consecutive sequences, ensuring better continuity and contextual coherence. In the second stage, we adopt a First-Fit-Decreasing algorithm to pack shorter texts into bins slightly larger than the target sequence length, thereby minimizing padding and truncation. Empirical evaluations across various model architectures and corpus domains demonstrate the effectiveness of our method, outperforming baselines in 99% of all settings. Code is available at https://github.com/Infernus-WIND/Seamless-Packing.
2024
pdf
bib
abs
Aligning Large Language Models with Human Preferences through Representation Engineering
Wenhao Liu
|
Xiaohua Wang
|
Muling Wu
|
Tianlong Li
|
Changze Lv
|
Zixuan Ling
|
Zhu JianHao
|
Cenyuan Zhang
|
Xiaoqing Zheng
|
Xuanjing Huang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Aligning large language models (LLMs) with human preferences is crucial for enhancing their utility in terms of helpfulness, truthfulness, safety, harmlessness, and interestingness. Existing methods for achieving this alignment often involve employing reinforcement learning from human feedback (RLHF) to fine-tune LLMs based on human labels assessing the relative quality of model responses. Nevertheless, RLHF is susceptible to instability during fine-tuning and presents challenges in implementation. Drawing inspiration from the emerging field of representation engineering (RepE), this study aims to identify relevant representations for high-level human preferences embedded in patterns of activity within an LLM and achieve precise control of model behavior by transforming its representations. This novel approach, denoted as Representation Alignment from Human Feedback (RAHF), proves to be effective, computationally efficient, and easy to implement. Extensive experiments demonstrate the efficacy of RAHF in not only capturing but also manipulating representations to align with a broad spectrum of human preferences or values, rather than being confined to a singular concept or function (e.g. honesty or bias). RAHF’s versatility in accommodating diverse human preferences shows its potential for advancing LLM performance.
pdf
bib
abs
Advancing Parameter Efficiency in Fine-tuning via Representation Editing
Muling Wu
|
Wenhao Liu
|
Xiaohua Wang
|
Tianlong Li
|
Changze Lv
|
Zixuan Ling
|
Zhu JianHao
|
Cenyuan Zhang
|
Xiaoqing Zheng
|
Xuanjing Huang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Parameter Efficient Fine-Tuning (PEFT) has gained significant attention for its ability to achieve competitive results while updating only a small subset of trainable parameters. Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer. RED substantially reduces the number of trainable parameters by a factor of 25,700 compared to full parameter fine-tuning, and by a factor of 32 compared to LoRA. Remarkably, RED achieves comparable or superior results to full parameter fine-tuning and other PEFT methods. Extensive experiments were conducted across models of varying architectures and scales, including RoBERTa, GPT-2, T5, and Llama-2, and the results demonstrate the efficiency and efficacy of RED, positioning it as a promising PEFT approach for large neural models.
pdf
bib
abs
Searching for Best Practices in Retrieval-Augmented Generation
Xiaohua Wang
|
Zhenghua Wang
|
Xuan Gao
|
Feiran Zhang
|
Yixin Wu
|
Zhibo Xu
|
Tianyuan Shi
|
Zhengyuan Wang
|
Shizheng Li
|
Qi Qian
|
Ruicheng Yin
|
Changze Lv
|
Xiaoqing Zheng
|
Xuanjing Huang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolonged response times. Typically, a RAG workflow involves multiple processing steps, each of which can be executed in various ways. Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices. Through extensive experiments, we suggest several strategies for deploying RAG that balance both performance and efficiency. Moreover, we demonstrate that multimodal retrieval techniques can significantly enhance question-answering capabilities about visual inputs and accelerate the generation of multimodal content using a “retrieval as generation” strategy.
pdf
bib
abs
Promoting Data and Model Privacy in Federated Learning through Quantized LoRA
Zhu JianHao
|
Changze Lv
|
Xiaohua Wang
|
Muling Wu
|
Wenhao Liu
|
Tianlong Li
|
Zixuan Ling
|
Cenyuan Zhang
|
Xiaoqing Zheng
|
Xuanjing Huang
Findings of the Association for Computational Linguistics: EMNLP 2024
Conventional federated learning primarily aims to secure the privacy of data distributed across multiple edge devices, with the global model dispatched to edge devices for parameter updates during the learning process. However, the development of large language models (LLMs) requires substantial data and computational resources, rendering them valuable intellectual properties for their developers and owners. To establish a mechanism that protects both data and model privacy in a federated learning context, we introduce a method that just needs to distribute a quantized version of the model’s parameters during training. This method enables accurate gradient estimations for parameter updates while preventing clients from accessing a model whose performance is comparable to the centrally hosted one. Moreover, we combine this quantization strategy with LoRA, a popular and parameter-efficient fine-tuning method, to significantly reduce communication costs in federated learning. The proposed framework, named FedLPP, successfully ensures both data and model privacy in the federated learning context. Additionally, the learned central model exhibits good generalization and can be trained in a resource-efficient manner.
2023
pdf
bib
abs
Parameter Efficient Multi-task Fine-tuning by Learning to Transfer Token-wise Prompts
Muling Wu
|
Wenhao Liu
|
Jianhan Xu
|
Changze Lv
|
Zixuan Ling
|
Tianlong Li
|
Longtao Huang
|
Xiaoqing Zheng
|
Xuanjing Huang
Findings of the Association for Computational Linguistics: EMNLP 2023
Prompt tuning has been proven to be successful on various tasks by incorporating a small number of trainable parameters while freezing large pre-trained language models (PLMs). However, it is still unsettled how to generate more proper prompts for any individual examples and how to extend prompt tuning to multi-task learning scenarios by leveraging cross-task features. To address these challenges, we propose a token-wise prompt tuning (TPT), in which a bank of finer-grained soft prompt tokens is built for multi-task learning by memory network. The tokens are retrieved from the bank against an input example and assembled to an instance-dependent prompt. Extensive experimental results on 14 datasets demonstrated that the models enhanced by our TPT performed far better than full parameter fine-tuned models and achieved state-of-the-art by tuning only 0.035% parameters.