VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews

Bahar İlgen; Ebenezer Awotoro; Georges Hattab

Abstract

Systematic reviews are fundamental to evidence-based medicine, but the clinical evidence they contain is primarily expressed in unstructured text, making large-scale extraction and reuse difficult. Existing biomedical NLP methods have achieved strong performance on span-level extraction from clinical trials and abstracts; however, these approaches are insufficient for systematic reviews, where evidence is often distributed across multiple studies, sentences, and sections and must be aggregated into normalized document-level attributes. We introduce VaxScope, a benchmark dataset for document-level structured evidence extraction from immunization-related systematic reviews. VaxScope is constructed through an expert-guided semi-automatic annotation pipeline that combines automatic candidate generation with domain expert validation to ensure consistency and annotation quality. We formalize the task as document-level structured extraction, where target labels are defined at the review level and require aggregating evidence beyond isolated textual spans. We further establish baselines for document-level structured extraction using abstract-level input representations and evaluate how access to evidence-grounded contextual input improves performance over abstract-only settings. Baseline experiments show that PubMedBERT achieves the best overall performance (Avg F1: 0.850), with evidence-grounded input improving performance particularly for fields requiring distributed contextual reasoning.

Anthology ID: 2026.bionlp-1.69
Volume: BioNLP 2026
Month: July
Year: 2026
Address: San Diego, California
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues: BioNLP | WS
Publisher: Association for Computational Linguistics
Pages: 853–863
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.69/
Cite (ACL): Bahar Ilgen, Ebenezer Awotoro, and Georges Hattab. 2026. VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews. In BioNLP 2026, pages 853–863, San Diego, California. Association for Computational Linguistics.
Cite (Informal): VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews (Ilgen et al., BioNLP 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.69.pdf
