Abstract
Ample evidence suggests that better machine learning models may be steadily obtained by training on increasingly larger datasets for natural language processing (NLP) problems in non-medical domains. Whether the same holds true for medical NLP has thus far not been thoroughly investigated. This work shows that this is indeed not always the case. We reveal the somewhat counter-intuitive observation that performant medical NLP models can be obtained with a small amount of labeled data, quite the opposite of the common belief, most likely due to the domain specificity of the problem. We quantitatively show the effect of training data size on a fixed test set composed of two of the largest public chest x-ray radiology report datasets, on the task of abnormality classification. The trained models not only make efficient use of the training data, but also outperform the current state-of-the-art rule-based systems by a significant margin.
- Anthology ID:
- 2020.clinicalnlp-1.31
- Volume:
- Proceedings of the 3rd Clinical Natural Language Processing Workshop
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Anna Rumshisky, Kirk Roberts, Steven Bethard, Tristan Naumann
- Venue:
- ClinicalNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 280–290
- URL:
- https://aclanthology.org/2020.clinicalnlp-1.31
- DOI:
- 10.18653/v1/2020.clinicalnlp-1.31
- Cite (ACL):
- Jean-Baptiste Lamare, Oloruntobiloba Olatunji, and Li Yao. 2020. On the diminishing return of labeling clinical reports. In Proceedings of the 3rd Clinical Natural Language Processing Workshop, pages 280–290, Online. Association for Computational Linguistics.
- Cite (Informal):
- On the diminishing return of labeling clinical reports (Lamare et al., ClinicalNLP 2020)
- PDF:
- https://preview.aclanthology.org/emnlp-22-attachments/2020.clinicalnlp-1.31.pdf