FHexchange: Resources for Family Health History Extraction and Normalization From Consumer Dialog Sources

Michelle Nguyen, Nidhi Soley, Ayah Zirikly, João Sedoc, Casey Taylor


Abstract
Family health history (FHx) offers insight into a person’s health and disease risk, but it is largely recorded in free-text clinical formats that must be processed before the data can be fully used. The rapid deployment of ambient AI scribes and conversational agents in clinical settings necessitates evaluation on dynamic patient-clinician and patient-agent dialogs. To address this need, we introduce two new datasets of patient FHx dialog documents designed to benchmark information extraction and entity linking. Unlike clinician-entered data, patient-reported dialog data has its own semantic and content characteristics, which need to be studied to support more patient-centered healthcare. We contribute a publicly available resource called FHexchange, with new annotations for family members, clinical observations, related entities, and standardized UMLS CUIs, offering the clinical NLP community a robust test bed for evaluating emerging generative AI tools.
Anthology ID:
2026.bionlp-1.82
Volume:
BioNLP 2026
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
1014–1028
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.82/
Cite (ACL):
Michelle Nguyen, Nidhi Soley, Ayah Zirikly, João Sedoc, and Casey Taylor. 2026. FHexchange: Resources for Family Health History Extraction and Normalization From Consumer Dialog Sources. In BioNLP 2026, pages 1014–1028, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
FHexchange: Resources for Family Health History Extraction and Normalization From Consumer Dialog Sources (Nguyen et al., BioNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.82.pdf