Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations

Shifeng Liu, Yifang Sun, Bing Li, Wei Wang, Florence T. Bourgeois, Adam G. Dunn


Abstract
The rapid growth in published clinical trials makes it difficult to maintain up-to-date systematic reviews, which require finding all relevant trials. This leads to policy and practice decisions based on out-of-date, incomplete, and biased subsets of available clinical evidence. Extracting and then normalising Population, Intervention, Comparator, and Outcome (PICO) information from clinical trial articles may be an effective way to automatically assign trials to systematic reviews and avoid searching and screening, the two most time-consuming systematic review processes. We propose and test a novel approach to PICO span detection. The major difference between our proposed method and previous approaches is that it detects spans without annotated span data, using only crowdsourced sentence-level annotations. Experiments on two datasets show that our method achieves much higher recall for PICO span detection than fully supervised methods, with PICO sentence detection at least as good as human annotations. By removing the reliance on expert annotations for span detection, this work could be used in a human-machine pipeline for turning low-quality, crowdsourced, sentence-level PICO annotations into structured information that can be used to quickly assign trials to relevant systematic reviews.
Anthology ID:
2021.findings-emnlp.147
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1705–1715
URL:
https://aclanthology.org/2021.findings-emnlp.147
DOI:
10.18653/v1/2021.findings-emnlp.147
Cite (ACL):
Shifeng Liu, Yifang Sun, Bing Li, Wei Wang, Florence T. Bourgeois, and Adam G. Dunn. 2021. Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1705–1715, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Sent2Span: Span Detection for PICO Extraction in the Biomedical Text without Span Annotations (Liu et al., Findings 2021)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2021.findings-emnlp.147.pdf
Video:
 https://preview.aclanthology.org/paclic-22-ingestion/2021.findings-emnlp.147.mp4
Code:
evidence-surveillance/sent2span
Data:
BLUE, EBM-NLP