Towards Reliable Generation of Clinical Chart Items: A Counterfactual Reasoning Approach with Large Language Models
Jiaxuan Li, Saed Rezayi, Peter Baldwin, Polina Harik, Victoria Yaneva
Abstract
This study explores GPT-4 for generating clinical chart items in medical education using three prompting strategies. Expert evaluations found many items usable or promising. The counterfactual approach enhanced novelty, and item quality improved with high-surprisal examples. This is the first investigation of LLMs for automated clinical chart item generation.
- Anthology ID: 2025.aimecon-main.16
- Volume: Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers
- Month: October
- Year: 2025
- Address: Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States
- Editors: Joshua Wilson, Christopher Ormerod, Magdalen Beiting Parrish
- Venue: AIME-Con
- Publisher: National Council on Measurement in Education (NCME)
- Pages: 142–153
- URL: https://preview.aclanthology.org/ingest-emnlp/2025.aimecon-main.16/
- Cite (ACL): Jiaxuan Li, Saed Rezayi, Peter Baldwin, Polina Harik, and Victoria Yaneva. 2025. Towards Reliable Generation of Clinical Chart Items: A Counterfactual Reasoning Approach with Large Language Models. In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers, pages 142–153, Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States. National Council on Measurement in Education (NCME).
- Cite (Informal): Towards Reliable Generation of Clinical Chart Items: A Counterfactual Reasoning Approach with Large Language Models (Li et al., AIME-Con 2025)
- PDF: https://preview.aclanthology.org/ingest-emnlp/2025.aimecon-main.16.pdf