TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER

Neil Torrero; Gerard Sant; Carlos Escolano

doi:10.18653/v1/2023.bionlp-1.48

Abstract

This paper describes the submission of the TALP-UPC team to the Problem List Summarization task from the BioNLP 2023 workshop. This task consists of automatically extracting a list of health issues from the e-health medical record of a given patient. Our submission combines additional steps of data annotation with fine-tuning of BERT pre-trained language models. Our experiments focus on the impact of fine-tuning on different datasets as well as the addition of data augmentation techniques to delay overfitting.

Anthology ID: 2023.bionlp-1.48
Volume: The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue: BioNLP
Publisher: Association for Computational Linguistics
Pages: 497–502
URL: https://aclanthology.org/2023.bionlp-1.48
DOI: 10.18653/v1/2023.bionlp-1.48
Cite (ACL): Neil Torrero, Gerard Sant, and Carlos Escolano. 2023. TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 497–502, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER (Torrero et al., BioNLP 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2023.bionlp-1.48.pdf
