HuatuoGPT, Towards Taming Language Model to Be a Doctor
Hongbo Zhang | Junying Chen | Feng Jiang | Fei Yu | Zhihong Chen | Guiming Chen | Jianquan Li | Xiangbo Wu | Zhang Zhiyi | Qingying Xiao | Xiang Wan | Benyou Wang | Haizhou Li
Findings of the Association for Computational Linguistics: EMNLP 2023
In this paper, we present HuatuoGPT, a Large Language Model (LLM) for medical consultation. The core recipe of HuatuoGPT is to leverage both distilled data from **ChatGPT** and real-world data from **doctors** in the supervised fine-tuning stage. This is not only because purely using **ChatGPT**-distilled data might cause ‘model collapse’, but also because real-world data from **doctors** is complementary to **ChatGPT**-distilled data. The responses from ChatGPT are usually detailed, well-presented, fluent, and instruction-following, but ChatGPT cannot perform like a doctor in many aspects, e.g., in interactive diagnosis. Therefore, the extra doctors’ data can tame a distilled language model to behave like a doctor. To synergize the strengths of both data sources, we introduce RLMF (Reinforcement Learning from Mixed Feedback), where a reward model is trained to align the language model with the merits that both sources (ChatGPT and doctors) bring. Experimental results (GPT-4 evaluation, human evaluation, and medical benchmark datasets) demonstrate that HuatuoGPT achieves state-of-the-art results in medical consultation among open-source LLMs. Notably, with the additional real-world data and RLMF, the distilled language model (i.e., HuatuoGPT) outperforms its teacher model (i.e., ChatGPT) in most cases.
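The abstract describes two ideas: mixing ChatGPT-distilled dialogues with doctor-written dialogues for supervised fine-tuning, and rewarding responses that combine the merits of both sources (RLMF). The sketch below is only an illustration of that recipe, not the authors' implementation; the names (`mix_sft_data`, `MixedFeedbackReward`), the mixing ratio, and the reward heuristics are all assumptions introduced here.

```python
# Minimal sketch (not the paper's code) of the HuatuoGPT training recipe as
# described in the abstract: (1) blend distilled and real-world dialogues for
# SFT, (2) score responses with a reward that values both ChatGPT-like merits
# (detailed, fluent) and doctor-like merits (interactive diagnosis).
# All names, ratios, and heuristics below are illustrative assumptions.

import random
from dataclasses import dataclass


@dataclass
class Dialogue:
    source: str   # "chatgpt" (distilled) or "doctor" (real-world)
    turns: list   # list of (role, text) tuples


def mix_sft_data(distilled, real_world, real_fraction=0.5, seed=0):
    """Interleave distilled and doctor dialogues for supervised fine-tuning.

    `real_fraction` is a hypothetical knob; the paper does not prescribe it.
    """
    rng = random.Random(seed)
    n_real = min(len(real_world), int(len(distilled) * real_fraction / (1.0 - real_fraction)))
    mixed = list(distilled) + rng.sample(real_world, n_real)
    rng.shuffle(mixed)
    return mixed


class MixedFeedbackReward:
    """Toy stand-in for the RLMF reward model.

    It favors responses that are both well-presented (a ChatGPT-like merit)
    and interactive/diagnostic (a doctor-like merit). A real reward model
    would be learned from preference data rather than hand-written rules.
    """

    def __call__(self, prompt: str, response: str) -> float:
        fluency = min(len(response.split()) / 100.0, 1.0)   # proxy for detailed, fluent answers
        interactivity = 1.0 if "?" in response else 0.0     # proxy for asking follow-up questions
        return 0.5 * fluency + 0.5 * interactivity          # weights are illustrative only


if __name__ == "__main__":
    distilled = [Dialogue("chatgpt", [("user", "I have a headache."),
                                      ("assistant", "Here is a detailed overview of common causes...")])]
    real = [Dialogue("doctor", [("user", "I have a headache."),
                                ("assistant", "How long has it lasted? Any fever or nausea?")])]
    sft_data = mix_sft_data(distilled, real, real_fraction=0.5)
    reward = MixedFeedbackReward()
    print(len(sft_data), reward("I have a headache.", "How long has it lasted? Any fever or nausea?"))
```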