TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER

Neil Torrero, Gerard Sant, Carlos Escolano


Abstract
This paper describes the submission of the TALP-UPC team to the Problem List Summarization task from the BioNLP 2023 workshop. This task consists of automatically extracting a list of health issues from the e-health medical record of a given patient. Our submission combines additional steps of data annotation with fine-tuning of pre-trained BERT language models. Our experiments focus on the impact of fine-tuning on different datasets as well as the addition of data augmentation techniques to delay overfitting.
Anthology ID:
2023.bionlp-1.48
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
497–502
URL:
https://aclanthology.org/2023.bionlp-1.48
DOI:
10.18653/v1/2023.bionlp-1.48
Cite (ACL):
Neil Torrero, Gerard Sant, and Carlos Escolano. 2023. TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 497–502, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER (Torrero et al., BioNLP 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.bionlp-1.48.pdf