Casey Taylor


2026

Family health history (FHx) offers insight into a person’s health and disease risk, but it is largely recorded as free text in clinical documents and must be processed before the data can be fully used. The rapid deployment of ambient AI scribes and conversational agents in clinical settings creates a need for evaluation on dynamic patient-clinician and patient-agent dialogs. To address this need, we introduce two new datasets of patient FHx dialog documents designed to benchmark information extraction and entity linking. Patient-reported dialog data differs from clinician-entered data in its semantics and content, and these differences need to be studied to support more patient-centered healthcare. We contribute a publicly available resource called FHexchange, with new annotations for family members, clinical observations, related entities, and standardized UMLS CUIs, offering the clinical NLP community a robust test bed for emerging generative AI tools.