Abstract
In the field of Large Language Models (LLMs), researchers are increasingly exploring their effectiveness across a wide range of tasks. However, a critical area that requires further investigation is the interpretability of these models, particularly their ability to generate rational explanations for their decisions. Most existing explanation datasets are limited to English and to the general domain, which leads to a scarcity of linguistic diversity and a lack of resources in specialized domains such as medicine. To mitigate this, we propose ExplainCPE, a challenging medical dataset consisting of over 7K problems from the Chinese Pharmacist Examination, specifically tailored to assess model-generated explanations. In the overall results, only GPT-4 passes the pharmacist examination, with 75.7% accuracy, while other models such as ChatGPT fail. Further detailed analysis of LLM-generated explanations reveals the limitations of LLMs in understanding medical text and performing computational reasoning. With the increasing importance of AI safety and trustworthiness, ExplainCPE takes a step towards improving and evaluating the interpretability of LLMs in the medical domain.
- Anthology ID:
- 2023.findings-emnlp.129
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1922–1940
- URL:
- https://aclanthology.org/2023.findings-emnlp.129
- DOI:
- 10.18653/v1/2023.findings-emnlp.129
- Cite (ACL):
- Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, and Min Zhang. 2023. ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1922–1940, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination (Li et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2023.findings-emnlp.129.pdf