Expert-Guided Schema-Based Structured Extraction from CONSORT Diagrams Using Vision-Language Models

Damian Stachura, Bartosz Przechera, Monika Opałek, Ewelina Sadowska, Ewa Borowiack, Artur Nowak


Abstract
Vision-language models (VLMs) are rapidly advancing on tasks that require visual understanding of text, tables, plots, and diagrams. Yet extracting structured information from text-heavy scientific diagrams remains challenging, as it requires not only OCR but also recovery of layout, grouping, and flow relationships. We study this problem in the context of CONSORT flow diagrams, which summarize participant screening, randomization, follow-up, and analysis in randomized controlled trials. We introduce a 200-example benchmark of PubMed Central diagrams, annotated by a biomedical team specializing in systematic literature reviews and clinical evidence extraction, and evaluate schema-constrained CONSORT extraction across proprietary and open-weight model families. Using structure-aware metrics, we compare single-pass and stepwise extraction strategies. Expert-guided single-pass extraction performs best for proprietary frontier models, with Gemini 3 Pro achieving the strongest overall results, whereas stepwise prompting improves less capable open-weight models on challenging arm-level extraction. These results offer practical deployment guidance and suggest that high-quality schema-constrained extraction is feasible, but not yet solved.
Anthology ID:
2026.bionlp-1.77
Volume:
BioNLP 2026
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
955–969
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.77/
Cite (ACL):
Damian Stachura, Bartosz Przechera, Monika Opałek, Ewelina Sadowska, Ewa Borowiack, and Artur Nowak. 2026. Expert-Guided Schema-Based Structured Extraction from CONSORT Diagrams Using Vision-Language Models. In BioNLP 2026, pages 955–969, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
Expert-Guided Schema-Based Structured Extraction from CONSORT Diagrams Using Vision-Language Models (Stachura et al., BioNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.77.pdf