Abstract
Labeling is typically the most human-intensive step in the development of supervised learning models. In this paper, we propose a simple, easy-to-implement visualization approach that reduces cognitive load and increases the speed of text labeling. The approach is tailored to the task of extracting patient smoking status from clinical notes. It consists of ordering the sentences that mention smoking, centering them on the smoking tokens, and annotating them to highlight the informative parts of the text. Our experiments on clinical notes from the MIMIC-III clinical database demonstrate that our visualization approach enables human annotators to label sentences up to 3 times faster than with a baseline approach.
- Anthology ID:
- 2021.dash-1.4
- Volume:
- Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Editors:
- Eduard Dragut, Yunyao Li, Lucian Popa, Slobodan Vucetic
- Venue:
- DaSH
- Publisher:
- Association for Computational Linguistics
- Pages:
- 24–30
- URL:
- https://aclanthology.org/2021.dash-1.4
- DOI:
- 10.18653/v1/2021.dash-1.4
- Cite (ACL):
- Saman Enayati, Ziyu Yang, Benjamin Lu, and Slobodan Vucetic. 2021. A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction. In Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances, pages 24–30, Online. Association for Computational Linguistics.
- Cite (Informal):
- A Visualization Approach for Rapid Labeling of Clinical Notes for Smoking Status Extraction (Enayati et al., DaSH 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2021.dash-1.4.pdf