Fast, Accurate, and Local Conversion of MIMIC-IV to OMOP with DBT
Adam Sutton, Niko Moller-Grell, Thomas Searle, Richard Dobson
Abstract
dbt mimic omop is a free, open-source resource that converts the MIMIC-IV dataset to the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) format on consumer-level hardware. CDM approaches are increasingly adopted in both industry and academia because of the need for interoperability and reproducibility, including in clinical NLP tasks such as cohort selection, information extraction, and retrieval-augmented generation. The MIMIC-IV database is among the most widely used critical care research datasets, yet existing pipelines that transform it to OMOP depend on enterprise database infrastructure and complex orchestration, limiting accessibility for practitioners and resource-constrained researchers. We further integrate free-text clinical notes (195.6M clinical annotations) and chest radiographs into the OMOP note_nlp and imaging extension tables, making all MIMIC-IV modalities (structured data, free text, and imaging) accessible through a common data model. This resource generates a more comprehensive dataset than existing alternatives and is intended to aid system development, testing, and evaluation.
- Anthology ID:
- 2026.bionlp-1.80
- Volume:
- BioNLP 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California
- Editors:
- Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
- Venues:
- BioNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 992–996
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.80/
- Cite (ACL):
- Adam Sutton, Niko Moller-Grell, Thomas Searle, and Richard Dobson. 2026. Fast, Accurate, and Local Conversion of MIMIC-IV to OMOP with DBT. In BioNLP 2026, pages 992–996, San Diego, California. Association for Computational Linguistics.
- Cite (Informal):
- Fast, Accurate, and Local Conversion of MIMIC-IV to OMOP with DBT (Sutton et al., BioNLP 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.80.pdf