Jordan Koontz


Evaluating Data Augmentation for Medication Identification in Clinical Notes
Jordan Koontz | Maite Oronoz | Alicia Pérez
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

We evaluate the effectiveness of data augmentation for improving the generalizability of a Named Entity Recognition model on the task of medication identification in clinical notes. We compare two data augmentation methods for creating synthetic training examples: mention replacement and generation with a generative model. Through experiments on the n2c2 2022 Track 1 Contextualized Medication Event Extraction dataset, we show that augmenting the training data with supplemental examples created with GPT-3 can improve the performance of a transformer-based model when the training set is small.
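
As an illustration of the mention replacement idea named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes BIO-tagged token sequences and a small lexicon of alternative mentions per entity type; the function name `replace_mentions`, the label `Drug`, and the example lexicon are all hypothetical.

```python
import random


def replace_mentions(tokens, tags, lexicon, rng):
    """Swap each BIO-tagged entity span for a lexicon entry of the same type.

    tokens, tags: parallel lists (tags use the B-/I-/O scheme).
    lexicon: dict mapping an entity label to a list of replacement strings
             (hypothetical resource, e.g. mentions seen in training data).
    rng: a random.Random instance, passed in for reproducibility.
    """
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            label = tag[2:]
            # Find the end of the current mention span.
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + label:
                j += 1
            candidates = lexicon.get(label)
            replacement = rng.choice(candidates).split() if candidates else []
            if not replacement:
                # No usable candidate: keep the original mention.
                replacement = tokens[i:j]
            new_tokens.extend(replacement)
            # Re-tag the replacement so labels stay aligned with tokens.
            new_tags.extend(["B-" + label] + ["I-" + label] * (len(replacement) - 1))
            i = j
        else:
            new_tokens.append(tokens[i])
            new_tags.append(tag)
            i += 1
    return new_tokens, new_tags


tokens = ["Patient", "started", "on", "acetaminophen", "today"]
tags = ["O", "O", "O", "B-Drug", "O"]
lexicon = {"Drug": ["warfarin sodium"]}  # illustrative entry only
aug_tokens, aug_tags = replace_mentions(tokens, tags, lexicon, random.Random(0))
```

The sketch re-tags the replacement so that a multi-token mention can stand in for a single-token one without misaligning the labels; the surrounding context is left unchanged.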