VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews

Bahar Ilgen, Ebenezer Awotoro, Georges Hattab


Abstract
Systematic reviews are fundamental to evidence-based medicine, but the clinical evidence they contain is primarily expressed in unstructured text, making large-scale extraction and reuse difficult. Existing biomedical NLP methods have achieved strong performance on span-level extraction from clinical trials and abstracts; however, these approaches are insufficient for systematic reviews, where evidence is often distributed across multiple studies, sentences, and sections and must be aggregated into normalized document-level attributes. We introduce VaxScope, a benchmark dataset for document-level structured evidence extraction from immunization-related systematic reviews. VaxScope is constructed through an expert-guided semi-automatic annotation pipeline that combines automatic candidate generation with domain expert validation to ensure consistency and annotation quality. We formalize the task as document-level structured extraction, where target labels are defined at the review level and require aggregating evidence beyond isolated textual spans. We further establish baselines for document-level structured extraction using abstract-level input representations and evaluate how access to evidence-grounded contextual input improves performance over abstract-only settings. Baseline experiments show that PubMedBERT achieves the best overall performance (Avg F1: 0.850), with evidence-grounded input improving performance particularly for fields requiring distributed contextual reasoning.
Anthology ID:
2026.bionlp-1.69
Volume:
BioNLP 2026
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
853–863
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.69/
Cite (ACL):
Bahar Ilgen, Ebenezer Awotoro, and Georges Hattab. 2026. VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews. In BioNLP 2026, pages 853–863, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
VaxScope: Document-Level Structured Evidence Extraction from Immunization Systematic Reviews (Ilgen et al., BioNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.69.pdf