TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER

Neil Torrero, Gerard Sant, Carlos Escolano


Abstract
This paper describes the submission of the TALP-UPC team to the Problem List Summarization task from the BioNLP 2023 workshop. This task consists of automatically extracting a list of health issues from the e-health medical record of a given patient. Our submission combines additional steps of data annotation with fine-tuning of pre-trained BERT language models. Our experiments focus on the impact of fine-tuning on different datasets as well as the addition of data augmentation techniques to delay overfitting.
Anthology ID:
2023.bionlp-1.48
Volume:
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
497–502
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2023.bionlp-1.48/
DOI:
10.18653/v1/2023.bionlp-1.48
Cite (ACL):
Neil Torrero, Gerard Sant, and Carlos Escolano. 2023. TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER. In Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 497–502, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
TALP-UPC at ProbSum 2023: Fine-tuning and Data Augmentation Strategies for NER (Torrero et al., BioNLP 2023)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2023.bionlp-1.48.pdf