KUL@SMM4H’22: Template Augmented Adaptive Pre-training for Tweet Classification

Sumam Francis, Marie-Francine Moens


Abstract
This paper describes models developed for the Social Media Mining for Health (SMM4H) 2022 shared tasks. Our team participated in the first subtask, which classifies tweets with Adverse Drug Effect (ADE) mentions. Our best-performing model comprises template-augmented task-adaptive pre-training followed by fine-tuning on the target task data. Augmentation with random prompt templates increases the amount of task-specific data used to adapt the language model to the target task domain. We explore two pre-training strategies, masked language modeling (MLM) and simple contrastive sentence-embedding pre-training (SimCSE), and examine the impact of adding template augmentations to each. Our system achieves an F1 score of 0.433 on the test set without using supplementary resources or medical dictionaries.
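As a rough illustration of the template-augmented MLM adaptive pre-training described in the abstract, the sketch below wraps tweets in randomly chosen prompt templates and continues masked-language-model training on the augmented corpus. The template strings, model name ("roberta-base"), placeholder tweets, and hyperparameters are illustrative assumptions, not the authors' actual setup.

```python
# Hypothetical sketch: template-augmented task-adaptive MLM pre-training.
# Templates, model, data, and hyperparameters are placeholders for illustration.
import random
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

TEMPLATES = [
    "{tweet}",
    "Tweet: {tweet}",
    "The following tweet may mention an adverse drug effect: {tweet}",
]

def augment(tweets, n_views=2):
    """Wrap each tweet in randomly chosen prompt templates to enlarge
    the task-specific pre-training corpus."""
    return [random.choice(TEMPLATES).format(tweet=t)
            for t in tweets for _ in range(n_views)]

# Placeholder tweets; in practice this would be the unlabeled task corpus.
tweets = ["got a terrible headache after taking drugX", "feeling great today"]
dataset = Dataset.from_dict({"text": augment(tweets)})

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tapt-mlm",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
# The adapted encoder would then be fine-tuned on the labeled ADE
# classification data (the SimCSE variant swaps the MLM objective for a
# contrastive one).
```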
Anthology ID:
2022.smm4h-1.41
Volume:
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
SMM4H
Publisher:
Association for Computational Linguistics
Pages:
153–155
URL:
https://aclanthology.org/2022.smm4h-1.41
Cite (ACL):
Sumam Francis and Marie-Francine Moens. 2022. KUL@SMM4H’22: Template Augmented Adaptive Pre-training for Tweet Classification. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 153–155, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
KUL@SMM4H’22: Template Augmented Adaptive Pre-training for Tweet Classification (Francis & Moens, SMM4H 2022)
PDF:
https://aclanthology.org/2022.smm4h-1.41.pdf