Abstract
In the field of Large Language Models (LLMs), researchers are increasingly exploring their effectiveness across a wide range of tasks. However, a critical area that requires further investigation is the interpretability of these models, particularly their ability to generate rational explanations for their decisions. Most existing explanation datasets are limited to English and to the general domain, which leads to a scarcity of linguistic diversity and a lack of resources in specialized domains such as medicine. To mitigate this, we propose ExplainCPE, a challenging medical dataset consisting of over 7K problems from the Chinese Pharmacist Examination, specifically tailored to assess model-generated explanations. In the overall results, only GPT-4 passes the pharmacist examination, with 75.7% accuracy, while other models such as ChatGPT fail. Further detailed analysis of LLM-generated explanations reveals the limitations of LLMs in understanding medical text and performing computational reasoning. With the increasing importance of AI safety and trustworthiness, ExplainCPE takes a step towards improving and evaluating the interpretability of LLMs in the medical domain.
- Anthology ID:
- 2023.findings-emnlp.129
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1922–1940
- URL:
- https://aclanthology.org/2023.findings-emnlp.129
- DOI:
- 10.18653/v1/2023.findings-emnlp.129
- Cite (ACL):
- Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, and Min Zhang. 2023. ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 1922–1940, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination (Li et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2023.findings-emnlp.129.pdf