2024
pdf
abs
SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models
Weixiang Zhao
|
Shilong Wang
|
Yulin Hu
|
Yanyan Zhao
|
Bing Qin
|
Xuanyu Zhang
|
Qing Yang
|
Dongliang Xu
|
Wanxiang Che
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world. Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input, aiming at handling the challenges of catastrophic forgetting and knowledge transfer in CL. However, these methods tend to address only one of the challenges, ignoring the potential of aligning the two modules to effectively address catastrophic forgetting and knowledge transfer simultaneously. To this end, we propose a novel Shared Attention Framework (SAPT), to align the PET learning and selection via the Shared Attentive Learning & Selection module. Extensive Experiments on two CL benchmarks demonstrate the superiority of SAPT. Moreover, SAPT consistently demonstrates its superiority when we scale it to different model sizes (from 770M to 13B), different model architectures (T5 and LLaMA-2) and unseen tasks.
pdf
abs
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
|
Zhuojun Li
|
Shilong Wang
|
Yang Wang
|
Yulin Hu
|
Yanyan Zhao
|
Chen Wei
|
Bing Qin
Findings of the Association for Computational Linguistics: ACL 2024
Emotional Intelligence (EI), consisting of emotion perception, emotion cognition and emotion expression, plays the critical roles in improving user interaction experience for the current large language model (LLM) based conversational general AI assistants. Previous works mainly focus on raising the emotion perception ability of them via naive fine-tuning on EI-related classification or regression tasks. However, this leads to the incomplete enhancement of EI and catastrophic forgetting of the general intelligence (GI). To this end, we first introduce EiBench, a large-scale collection of EI-related tasks in the text-to-text format with task instructions that covers all three aspects of EI, which lays a solid foundation for the comprehensive EI enhancement of LLMs. Then a novel Modular Emotional Intelligence enhancement method (**MoEI**), consisting of Modular Parameter Expansion and intra-inter modulation, is proposed to comprehensively enhance the EI of LLMs without compromise their GI. Extensive experiments on two representative LLM-based assistants, Flan-T5 and LLaMA-2-Chat, demonstrate the effectiveness of MoEI to improving EI while maintain GI.
2023
pdf
abs
TransESC: Smoothing Emotional Support Conversation via Turn-Level State Transition
Weixiang Zhao
|
Yanyan Zhao
|
Shilong Wang
|
Bing Qin
Findings of the Association for Computational Linguistics: ACL 2023
Emotion Support Conversation (ESC) is an emerging and challenging task with the goal of reducing the emotional distress of people. Previous attempts fail to maintain smooth transitions between utterances in ESC because they ignoring to grasp the fine-grained transition information at each dialogue turn. To solve this problem, we propose to take into account turn-level state Transitions of ESC (TransESC) from three perspectives, including semantics transition, strategy transition and emotion transition, to drive the conversation in a smooth and natural way. Specifically, we construct the state transition graph with a two-step way, named transit-then-interact, to grasp such three types of turn-level transition information. Finally, they are injected into the transition aware decoder to generate more engaging responses. Both automatic and human evaluations on the benchmark dataset demonstrate the superiority of TransESC to generate more smooth and effective supportive responses. Our source code will be publicly available.
2006
pdf
Reranking Answers for Definitional QA Using Language Modeling
Yi Chen
|
Ming Zhou
|
Shilong Wang
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics