Peng Chang
2025
Prefix-Enhanced Large Language Models with Reused Training Data in Multi-Turn Medical Dialogue
Suxue Ma | Zhicheng Yang | Ruei-Sung Lin | Youbao Tang | Ning Zhang | Zhenjie Cao | Yuan Ni | Jing Xiao | Jieke Hou | Peng Chang
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Large Language Models have made impressive progress in the medical field. Unlike traditional single-turn question-answering tasks, multi-turn doctor-patient dialogue tasks require AI doctors to interact with patients over multiple rounds, where the quality of each response affects overall model performance. In this paper, we propose PERT to re-explore the value of multi-turn dialogue training data after the supervised fine-tuning phase by integrating a prefix learning strategy, further enhancing response quality. Our preliminary results show that PERT achieves notable improvements on gynecological data, with an increase of up to 0.22 on a 5-point rating scale.
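The abstract describes integrating a prefix learning strategy on top of an already supervised fine-tuned model, but does not spell out the mechanism. As a rough illustration only, the PyTorch sketch below shows a generic prefix-tuning setup in which a frozen causal LM is augmented with a small block of trainable prefix embeddings; the class name `PrefixTunedLM`, the prefix length, and the HuggingFace-style `inputs_embeds` interface are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class PrefixTunedLM(nn.Module):
    """Generic prefix-tuning wrapper (illustrative, not the paper's PERT):
    a frozen causal LM plus a small set of trainable prefix embeddings
    prepended to every input sequence."""

    def __init__(self, base_lm, hidden_size, prefix_len=20):
        super().__init__()
        self.base_lm = base_lm
        for p in self.base_lm.parameters():  # keep the SFT weights frozen
            p.requires_grad = False
        # The prefix is the only set of parameters updated in this stage.
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden_size) * 0.02)

    def forward(self, input_embeds, attention_mask):
        batch = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the learned prefix along the sequence dimension.
        embeds = torch.cat([prefix, input_embeds], dim=1)
        prefix_mask = torch.ones(batch, self.prefix.size(0),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        mask = torch.cat([prefix_mask, attention_mask], dim=1)
        # Assumes a HuggingFace-style model that accepts inputs_embeds.
        return self.base_lm(inputs_embeds=embeds, attention_mask=mask)
```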
2022
Self-supervised Cross-modal Pretraining for Speech Emotion Recognition and Sentiment Analysis
Iek-Heng Chu | Ziyi Chen | Xinlu Yu | Mei Han | Jing Xiao | Peng Chang
Findings of the Association for Computational Linguistics: EMNLP 2022
Multimodal speech emotion recognition (SER) and sentiment analysis (SA) are important techniques for human-computer interaction. Most existing multimodal approaches use either shallow cross-modal fusion of pretrained features or deep cross-modal fusion of raw features. Recently, attempts have been made to fuse pretrained feature representations in a deep manner during the fine-tuning stage. However, those approaches have not led to improved results, partly due to their relatively simple fusion mechanisms and the lack of proper cross-modal pretraining. In this work, leveraging single-modal pretrained models (RoBERTa and HuBERT), we propose a novel deeply fused audio-text bi-modal transformer with a carefully designed cross-modal fusion mechanism and a stage-wise cross-modal pretraining scheme to fully facilitate cross-modal learning. Our experimental results show that the proposed method achieves state-of-the-art results on the public IEMOCAP emotion and CMU-MOSEI sentiment datasets, exceeding the previous benchmarks by a large margin.
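The abstract names the backbone encoders (RoBERTa for text, HuBERT for audio) but not the exact fusion design, so the PyTorch sketch below shows only a generic bidirectional cross-attention fusion layer of the kind such deep-fusion models typically use; the class name, dimensions, and mean-pooling merge are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusionLayer(nn.Module):
    """Illustrative bidirectional cross-attention block: text tokens attend to
    audio frames and vice versa, then the two streams are pooled and merged."""

    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.text_to_audio = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.audio_to_text = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_t = nn.LayerNorm(dim)
        self.norm_a = nn.LayerNorm(dim)
        self.merge = nn.Linear(2 * dim, dim)

    def forward(self, text_feats, audio_feats):
        # text_feats: (B, T_text, dim),  e.g. RoBERTa hidden states
        # audio_feats: (B, T_audio, dim), e.g. HuBERT hidden states
        t2a, _ = self.text_to_audio(text_feats, audio_feats, audio_feats)
        a2t, _ = self.audio_to_text(audio_feats, text_feats, text_feats)
        text_out = self.norm_t(text_feats + t2a)
        audio_out = self.norm_a(audio_feats + a2t)
        # Pool each stream and merge into one utterance-level vector
        # for a downstream emotion/sentiment classifier head.
        pooled = torch.cat([text_out.mean(dim=1), audio_out.mean(dim=1)], dim=-1)
        return self.merge(pooled)
```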