A Deterministic Multi-Stage Retrieval Pipeline for Longitudinal EHR Question Answering
Shubham Agarwal, Thomas Searle, Richard Dobson, Ninoslav Majkic, Niko Moller-Grell
Abstract
Retrieval-augmented generation (RAG) holds promise for clinical question answering over electronic health records (EHRs), but existing systems treat retrieval as an opaque subroutine, limiting auditability and reliability in patient care workflows. We introduce a deterministic multi-stage retrieval pipeline for longitudinal EHR question answering that decomposes retrieval into four distinct, ablated stages where each stage is instrumented with diagnostic metrics, making the flow of clinical evidence measurable and auditable at every step. Evaluated on a broad LLM-annotated cohort and an expert-annotated cardiovascular benchmark developed alongside clinicians from real ICU records, the full pipeline achieves 22-23% relative recall gain over a strong dense retrieval baseline across both cohorts, with consistent improvements in downstream answer quality. The pipeline’s deterministic and transparent design addresses a critical gap in clinical NLP: retrieval systems that clinicians and researchers can not only rely on, but inspect, audit, and build upon for real-world deployment.
- Anthology ID:
- 2026.bionlp-1.53
- Volume:
- BioNLP 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California
- Editors:
- Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
- Venues:
- BioNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 665–678
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.53/
- Cite (ACL):
- Shubham Agarwal, Thomas Searle, Richard Dobson, Ninoslav Majkic, and Niko Moller-Grell. 2026. A Deterministic Multi-Stage Retrieval Pipeline for Longitudinal EHR Question Answering. In BioNLP 2026, pages 665–678, San Diego, California. Association for Computational Linguistics.
- Cite (Informal):
- A Deterministic Multi-Stage Retrieval Pipeline for Longitudinal EHR Question Answering (Agarwal et al., BioNLP 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.53.pdf