CReSE: Benchmark Data and Automatic Evaluation Framework for Recommending Eligibility Criteria from Clinical Trial Information

Siun Kim, Jung-Hyun Won, David Lee, Renqian Luo, Lijun Wu, Tao Qin, Howard Lee


Abstract
Eligibility criteria (EC) refer to the set of conditions an individual must meet to participate in a clinical trial, defining the study population and minimizing potential risks to patients. Previous research in clinical trial design has primarily focused on searching for similar trials and generating EC under manual instructions, employing similarity-based performance metrics that may not fully reflect human judgment. In this study, we propose a novel task of recommending EC based on clinical trial information, including trial titles, and introduce an automatic evaluation framework to assess the clinical validity of EC recommendation models. Our new approach, CReSE (Contrastive learning and Rephrasing-based and Clinical Relevance-preserving Sentence Embedding), represents EC through contrastive learning and rephrasing via large language models (LLMs). The CReSE model outperforms existing language models pre-trained on the biomedical domain in EC clustering. Additionally, we have curated a benchmark dataset of 3.2M high-quality EC-title pairs extracted from 270K clinical trials available on ClinicalTrials.gov. The EC recommendation models achieve promising performance, with 49.0% precision@1 and 44.2% MAP@5 under our evaluation framework. We expect the evaluation framework built on the CReSE model to contribute significantly to the development and assessment of EC recommendation models in terms of clinical validity.
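The abstract reports precision@1 and MAP@5 for the EC recommendation models. As a reading aid, the following is a minimal sketch of how these standard ranking metrics are typically computed; the function names and toy data are illustrative assumptions, not taken from the paper or its code.

```python
def precision_at_1(ranked, relevant):
    """1.0 if the top-ranked recommended EC is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0

def average_precision_at_k(ranked, relevant, k=5):
    """Average precision over the top-k recommendations for one trial."""
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for rank, ec in enumerate(ranked[:k], start=1):
        if ec in relevant:
            hits += 1
            score += hits / rank  # precision at this cut-off
    return score / min(len(relevant), k)

def map_at_k(runs, k=5):
    """Mean average precision over a list of (ranked, relevant) pairs."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)

# Toy example (hypothetical EC labels): one query with relevant set {"a", "b"}.
ranked = ["a", "x", "b", "y", "z"]
print(precision_at_1(ranked, {"a", "b"}))           # top item "a" is relevant
print(average_precision_at_k(ranked, {"a", "b"}))   # (1/1 + 2/3) / 2
```

Reporting MAP@5 rather than plain precision@5 rewards models that place relevant criteria near the top of the list, which matches how a trial designer would scan recommendations.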
Anthology ID:
2024.findings-eacl.149
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2243–2273
URL:
https://aclanthology.org/2024.findings-eacl.149
Cite (ACL):
Siun Kim, Jung-Hyun Won, David Lee, Renqian Luo, Lijun Wu, Tao Qin, and Howard Lee. 2024. CReSE: Benchmark Data and Automatic Evaluation Framework for Recommending Eligibility Criteria from Clinical Trial Information. In Findings of the Association for Computational Linguistics: EACL 2024, pages 2243–2273, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
CReSE: Benchmark Data and Automatic Evaluation Framework for Recommending Eligibility Criteria from Clinical Trial Information (Kim et al., Findings 2024)
PDF:
https://preview.aclanthology.org/add_acl24_videos/2024.findings-eacl.149.pdf
Note:
 2024.findings-eacl.149.note.zip
Video:
 https://preview.aclanthology.org/add_acl24_videos/2024.findings-eacl.149.mp4