Jim Buttery

2022

CHAAI@SMM4H’22: RoBERTa, GPT-2 and Sampling - An interesting concoction
Christopher Palmer | Sedigheh Khademi Habibabadi | Muhammad Javed | Gerardo Luis Dimaguila | Jim Buttery
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper describes our team's approaches to Tasks 1 and 6 of the SMM4H 2022 Shared Tasks. Task 6 was the “Classification of tweets which indicate self-reported COVID-19 vaccination status (in English)”. Our best test F1 score was 0.82, obtained with a CT-BERT model; this exceeded the median test F1 score of 0.77 and was close to the 0.83 F1 score of the SMM4H baseline model. Task 1 was described as the “Classification, detection and normalization of Adverse Events (AE) mentions in tweets (in English)”. We undertook Task 1a and, with a RoBERTa-base model, achieved an F1 score of 0.61 on the test data, which exceeded the task's mean test F1 score of 0.56.
