2024
Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback
Chengfeng Dou | Ying Zhang | Zhi Jin | Wenpin Jiao | Haiyan Zhao | Yongqiang Zhao | Zhengwei Tao
Findings of the Association for Computational Linguistics: ACL 2024
The utilization of large language models for medical dialogue generation has attracted considerable attention due to its potential to enhance response richness and coherence. While previous studies have made strides in optimizing model performance, there is a pressing need to bolster the model’s capacity for diagnostic logic to ensure patient safety. In response to this need, we propose an approach termed preference learning from process feedback (PLPF), which involves integrating the doctor’s diagnostic logic into LLMs. PLPF encompasses three key components: rule modeling, preference data generation, and preference alignment. These components collectively serve to train the model to adhere to the diagnostic process. Our experimental results, utilizing Standardized Patient Testing, demonstrate that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%, surpassing the performance of traditional approaches. Moreover, PLPF exhibits effectiveness in both multi-round and single-round dialogue tasks, thereby highlighting its potential in improving medical dialogue generation. Our dataset is available at https://github.com/Chengfeng-Dou/SpTesting.
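The abstract outlines a three-stage pipeline (rule modeling, preference data generation, preference alignment). Below is a minimal Python sketch of how rule-scored preference pairs and a pairwise alignment loss could fit together; the rule checks, function names, and the Bradley-Terry style loss are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's code): rule-scored preference pairs
# plus a pairwise alignment loss. Rule checks and names are hypothetical.
import torch
import torch.nn.functional as F

def diagnostic_rule_score(response: str) -> int:
    """Toy rule model: count how many assumed diagnostic-process rules
    a response satisfies (symptom inquiry, examination, diagnosis)."""
    rules = ("symptom", "test", "diagnosis")
    return sum(keyword in response.lower() for keyword in rules)

def make_preference_pair(resp_a: str, resp_b: str):
    """Label the higher-scoring response as preferred (chosen)."""
    if diagnostic_rule_score(resp_a) >= diagnostic_rule_score(resp_b):
        return resp_a, resp_b
    return resp_b, resp_a

def preference_loss(logp_chosen: torch.Tensor,
                    logp_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style objective: raise the policy's log-probability
    of the chosen response above that of the rejected one."""
    return -F.logsigmoid(logp_chosen - logp_rejected).mean()

chosen, rejected = make_preference_pair(
    "Could you describe your symptoms before we discuss a diagnosis?",
    "You definitely have the flu.",
)
# Dummy sequence log-probs standing in for policy-model scores.
print(preference_loss(torch.tensor([-12.3]), torch.tensor([-15.8])))
```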
2023
UniEvent: Unified Generative Model with Multi-Dimensional Prefix for Zero-Shot Event-Relational Reasoning
Zhengwei Tao | Zhi Jin | Haiyan Zhao | Chengfeng Dou | Yongqiang Zhao | Tao Shen | Chongyang Tao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Reasoning about events and their relations has attracted surging research effort, since it is regarded as an indispensable ability for various event-centric and common-sense reasoning tasks. However, these tasks often suffer from limited data availability due to the labor-intensive nature of their annotations. Consequently, recent studies have explored knowledge transfer approaches within a multi-task learning framework to address this challenge. Although such methods achieve acceptable results, these brute-force solutions struggle to effectively transfer event-relational knowledge due to the vast array of inter-event relations (e.g., temporal, causal, conditional) and reasoning formulations (e.g., discriminative, abductive, ending prediction). To enhance knowledge transfer and enable zero-shot generalization among various combinations, in this work we propose a novel unified framework called UNIEVENT. Inspired by prefix-based multitask learning, our approach organizes event-relational reasoning tasks into a coordinate system with multiple axes representing inter-event relations and reasoning formulations. We then train a unified text-to-text generative model that utilizes coordinate-assigning prefixes for each task. By leveraging our adapted prefixes, our unified model achieves state-of-the-art or competitive performance on both zero-shot and supervised reasoning tasks, as demonstrated in extensive experiments.
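A minimal sketch of the coordinate-assigning prefix idea: each task sits at an (inter-event relation, reasoning formulation) coordinate that is serialized into a prefix for the unified text-to-text model. Textual prefix tags stand in here for the learned prefix embeddings a prefix-tuning setup would use; the tag format and task inventory are assumptions.

```python
# Sketch only: textual coordinate prefixes in the spirit of UniEvent.
RELATIONS = ("temporal", "causal", "conditional")
FORMULATIONS = ("discriminative", "abductive", "ending_prediction")

def coordinate_prefix(relation: str, formulation: str) -> str:
    """Serialize a task coordinate into a prefix for the shared model."""
    assert relation in RELATIONS and formulation in FORMULATIONS
    return f"<rel={relation}> <form={formulation}>"

def build_input(relation: str, formulation: str, instance: str) -> str:
    """Prepend the coordinate prefix so one generative model can route
    knowledge across every (relation, formulation) combination,
    including combinations unseen at training time (zero-shot)."""
    return f"{coordinate_prefix(relation, formulation)} {instance}"

print(build_input("causal", "abductive",
                  "Event A: the dam broke. Event B: the town flooded."))
```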
SEAG: Structure-Aware Event Causality Generation
Zhengwei Tao | Zhi Jin | Xiaoying Bai | Haiyan Zhao | Chengfeng Dou | Yongqiang Zhao | Fang Wang | Chongyang Tao
Findings of the Association for Computational Linguistics: ACL 2023
Extracting event causality underlies a broad spectrum of natural language processing applications. Cutting-edge methods break this task into Event Detection and Event Causality Identification. Although such pipelined solutions achieve acceptable results, separating the task incurs inherent limitations: on the one hand, the pipeline lacks cross-task dependencies and may propagate errors; on the other, predicting events and relations separately undermines the integrity of the event causality graph (ECG). To address these issues, we propose an approach for Structure-Aware Event Causality Generation (SEAG). With a graph linearization module, we generate the ECG structure via text-to-text generation based on a pre-trained language model. To foster the structural representation of the ECG, we introduce a novel Causality Structural Discrimination training paradigm, in which structural discriminative training is performed alongside auto-regressive generation, enabling the model to distinguish correct ECGs from constructed incorrect ones. Experiments on three datasets demonstrate the effectiveness of structural event causality generation and of the causality structural discrimination training.
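A sketch of what graph linearization and the corrupted negatives for structural discrimination could look like; the tag vocabulary and the edge-rewiring corruption below are illustrative assumptions, not SEAG's exact scheme.

```python
# Sketch only: ECG linearization plus a toy corruption for contrastive
# structural discrimination. Tags and corruption strategy are assumed.
import random

def linearize_ecg(events, causal_edges):
    """Flatten an ECG into a tagged token sequence, e.g.
    '<event> e1 <event> e2 <cause> e1 -> e2'."""
    parts = [f"<event> {e}" for e in events]
    parts += [f"<cause> {src} -> {dst}" for src, dst in causal_edges]
    return " ".join(parts)

def corrupt_ecg(events, causal_edges, rng=random):
    """Build an incorrect ECG by rewiring one causal edge; the model is
    trained to score the true linearization above corrupted ones.
    (A real implementation would resample if the rewire is a no-op.)"""
    edges = list(causal_edges)
    i = rng.randrange(len(edges))
    src, _ = edges[i]
    edges[i] = (src, rng.choice([e for e in events if e != src]))
    return edges

events = ["earthquake", "tsunami", "evacuation"]
edges = [("earthquake", "tsunami"), ("tsunami", "evacuation")]
print(linearize_ecg(events, edges))                   # positive sequence
print(linearize_ecg(events, corrupt_ecg(events, edges)))  # negative
```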
PlugMed: Improving Specificity in Patient-Centered Medical Dialogue Generation using In-Context Learning
Chengfeng Dou | Zhi Jin | Wenpin Jiao | Haiyan Zhao | Yongqiang Zhao | Zhengwei Tao
Findings of the Association for Computational Linguistics: EMNLP 2023
Patient-centered medical dialogue systems strive to offer diagnostic interpretation services to users with limited medical knowledge, emphasizing responses tailored to each patient. Despite their promising performance on some medical tasks, large language models (LLMs) find it difficult to guarantee the specificity of their responses. Inspired by in-context learning, we propose PlugMed, a Plug-and-Play Medical Dialogue System, to address this challenge. PlugMed is equipped with two modules, a prompt generation (PG) module and a response ranking (RR) module, which enhance the LLM's dialogue strategies to improve the specificity of the dialogue. The PG module stimulates the imitative ability of LLMs by providing them with real dialogues from similar patients as prompts. The RR module incorporates a fine-tuned small model as a response filter, enabling the selection of appropriate responses generated by LLMs. Furthermore, we introduce a new evaluation method, based on matching both the user's intent and high-frequency medical terms, to effectively assess the specificity of the responses. We conduct experimental evaluations on three medical dialogue datasets, and the results, from both automatic and human evaluation, demonstrate the effectiveness of our approach.
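A toy sketch of the two plug-in modules as described: PG retrieves similar historical dialogues as in-context prompts, and RR filters candidate responses with a small scorer. The string-overlap retriever and length-based scorer are stand-ins for the dense retriever and fine-tuned filter model a real system would use.

```python
# Sketch only: PlugMed-style PG retrieval and RR filtering with
# stand-in components (string similarity, toy specificity scorer).
from difflib import SequenceMatcher

def pg_module(patient_query, dialogue_bank, k=2):
    """Retrieve the k historical dialogues most similar to the query,
    to serve as in-context prompts for the LLM."""
    return sorted(
        dialogue_bank,
        key=lambda d: SequenceMatcher(None, patient_query, d).ratio(),
        reverse=True,
    )[:k]

def rr_module(candidates, specificity_scorer):
    """Pick the candidate response the filter scores highest for
    patient-specific content."""
    return max(candidates, key=specificity_scorer)

bank = ["Patient: headache and fever. Doctor: any chills or stiff neck?",
        "Patient: knee pain after running. Doctor: does rest help?"]
prompts = pg_module("I have a fever and a headache", bank)
best = rr_module(["Drink water.",
                  "Given your fever and headache, let's check for chills."],
                 specificity_scorer=len)  # toy scorer: longer = more specific
print(prompts, best, sep="\n")
```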
2020
DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification
Lianwei Wu | Yuan Rao | Yongqiang Zhao | Hao Liang | Ambreen Nazir
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Recently, many methods have used neural networks to discover effective evidence from reliable sources for explainable claim verification, and such evidence has been widely recognized. However, in these methods the evidence-discovery process is opaque and unexplained. Moreover, the discovered evidence targets the interpretability of the claim as a whole and is insufficient for focusing on the false parts of the claim. In this paper, we propose a Decision Tree-based Co-Attention model (DTCA) to discover evidence for explainable claim verification. Specifically, we first construct a Decision Tree-based Evidence model (DTE) to select comments with high credibility as evidence in a transparent and interpretable way. We then design Co-attention Self-attention networks (CaSa) that let the selected evidence interact with claims, which serves to 1) train DTE to determine the optimal decision thresholds and obtain more powerful evidence, and 2) utilize the evidence to find the false parts of the claim. Experiments on two public datasets, RumourEval and PHEME, demonstrate that DTCA not only provides explanations for the results of claim verification but also achieves state-of-the-art performance, boosting the F1-score by more than 3.11% and 2.41%, respectively.
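A sketch of the transparent evidence-selection step (DTE): each tree node is a human-readable credibility test, and only comments passing every node become evidence. The features and fixed thresholds are illustrative assumptions; in DTCA the thresholds are tuned through the co-attention training signal.

```python
# Sketch only: interpretable threshold tree for evidence selection,
# with assumed features and hand-set thresholds.
from dataclasses import dataclass

@dataclass
class Comment:
    text: str
    author_credibility: float  # assumed account-level reliability in [0, 1]
    relevance: float           # assumed similarity to the claim in [0, 1]

def dte_select(comments, cred_threshold=0.6, rel_threshold=0.5):
    """Transparent evidence filter: each node of this depth-2 tree is a
    human-readable threshold test, so every selection is explainable."""
    evidence = []
    for c in comments:
        if c.author_credibility < cred_threshold:   # node 1: credibility
            continue
        if c.relevance < rel_threshold:             # node 2: relevance
            continue
        evidence.append(c)
    return evidence

comments = [Comment("Official report confirms this.", 0.9, 0.8),
            Comment("lol fake", 0.2, 0.3)]
print([c.text for c in dte_select(comments)])
```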