CompLx@SMM4H’22: In-domain pretrained language models for detection of adverse drug reaction mentions in English tweets

Orest Xherija, Hojoon Choi


Abstract
This paper describes the system that team CompLx developed for sub-task 1a of the Social Media Mining for Health 2022 (#SMM4H) Shared Task. We finetune RoBERTa, a pretrained, transformer-based language model, on the provided dataset to classify English tweets for mentions of Adverse Drug Reactions (ADRs), i.e. negative side effects related to medication intake. With only simple finetuning, our approach achieves competitive results, significantly outperforming the average score across submitted systems. We make the model checkpoints and code publicly available. We also create a web application that provides a user-friendly, readily accessible interface for anyone interested in exploring the model’s capabilities.
Anthology ID:
2022.smm4h-1.47
Volume:
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Graciela Gonzalez-Hernandez, Davy Weissenbacher
Venue:
SMM4H
Publisher:
Association for Computational Linguistics
Pages:
176–181
URL:
https://aclanthology.org/2022.smm4h-1.47
Cite (ACL):
Orest Xherija and Hojoon Choi. 2022. CompLx@SMM4H’22: In-domain pretrained language models for detection of adverse drug reaction mentions in English tweets. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 176–181, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
CompLx@SMM4H’22: In-domain pretrained language models for detection of adverse drug reaction mentions in English tweets (Xherija & Choi, SMM4H 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.smm4h-1.47.pdf
Data
GLUE