Abstract
Many clinical informatics tasks that are based on electronic health records (EHRs) need relevant patient cohorts to be selected based on findings, symptoms, and diseases. Frequently, these conditions are described in radiology reports, which can be retrieved using information retrieval (IR) methods. The latest of these techniques utilize neural IR models such as BERT trained on clinical text. However, these methods still lack semantic understanding of the underlying clinical conditions and of ruled-out findings, resulting in poor precision during retrieval. In this paper, we combine clinical finding detection with supervised query match learning. Specifically, we use lexicon-driven concept detection to detect relevant findings in sentences. These findings are used as queries to train a Sentence-BERT (SBERT) model using triplet loss on matched and unmatched query-sentence pairs. We show that the proposed supervised training task remarkably improves the retrieval performance of SBERT. The trained model generalizes well to unseen queries and reports from different collections.
- Anthology ID:
- 2022.naacl-main.253
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3457–3463
- URL:
- https://aclanthology.org/2022.naacl-main.253
- DOI:
- 10.18653/v1/2022.naacl-main.253
- Cite (ACL):
- Luyao Shi, Tanveer Syeda-mahmood, and Tyler Baldwin. 2022. Improving Neural Models for Radiology Report Retrieval with Lexicon-based Automated Annotation. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3457–3463, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Improving Neural Models for Radiology Report Retrieval with Lexicon-based Automated Annotation (Shi et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2022.naacl-main.253.pdf
- Data
- MS MARCO
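
The training setup described in the abstract (lexicon-detected findings used as queries, then triplet loss over matched and unmatched query-sentence pairs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy `LEXICON`, the prefix-based negation cues, the helper names `detect_findings`, `build_triplets`, and `triplet_loss`, the Euclidean distance, and the margin value. In practice the vectors passed to the loss would be SBERT embeddings rather than hand-written lists.

```python
import math

# Hypothetical toy lexicon: surface term -> canonical finding (not from the paper).
LEXICON = {
    "pneumothorax": "pneumothorax",
    "pleural effusion": "pleural effusion",
    "effusion": "pleural effusion",
    "cardiomegaly": "cardiomegaly",
}
# Assumed negation cues; a real system would use a dedicated negation detector.
NEGATION_CUES = ("no ", "without ", "negative for ")


def detect_findings(sentence):
    """Return the findings asserted (not ruled out) in a sentence."""
    text = sentence.lower()
    found = set()
    for term, finding in LEXICON.items():
        idx = text.find(term)
        if idx == -1:
            continue
        # Treat the finding as ruled out if a negation cue precedes the term.
        if not any(cue in text[:idx] for cue in NEGATION_CUES):
            found.add(finding)
    return found


def build_triplets(sentences):
    """Build (query, matched sentence, unmatched sentence) triplets."""
    labeled = [(s, detect_findings(s)) for s in sentences]
    queries = sorted({f for _, fs in labeled for f in fs})
    triplets = []
    for q in queries:
        positives = [s for s, fs in labeled if q in fs]
        negatives = [s for s, fs in labeled if q not in fs]
        for p in positives:
            for n in negatives:
                triplets.append((q, p, n))
    return triplets


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss on embedding vectors, using Euclidean distance."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


sentences = [
    "There is no pneumothorax.",
    "Small left pneumothorax is present.",
    "Heart size is normal.",
]
triplets = build_triplets(sentences)
```

In this sketch the ruled-out sentence "There is no pneumothorax." becomes a negative for the query "pneumothorax", which mirrors the abstract's point that ruled-out findings should not count as matches.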