Sentence Contextual Encoder with BERT and BiLSTM for Automatic Classification with Imbalanced Medication Tweets

Olanrewaju Tahir Aduragba, Jialin Yu, Gautham Senthilnathan, Alexandra Cristea


Abstract
This paper describes the system and approach our team used for Task 1 of the SMM4H 2020 competition, which targets the automatic classification of tweets that mention medication. We adapted the standard BERT pretrain-then-fine-tune approach to include an intermediate training stage, in which a BiLSTM neural network acts as a further fine-tuning stage. The design was inspired by the effectiveness of within-task further pre-training and of sentence encoders. We show that this approach works well on a highly imbalanced dataset in which the positive class makes up only 0.2% of the data. Our model scored above the mean of all competition participants on both F1 and precision, and achieved a competitive recall.
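To illustrate why the abstract reports precision, recall, and F1 rather than accuracy, the sketch below computes these metrics for the positive class on a toy dataset with the same 0.2% positive rate. The labels and both sets of predictions are hypothetical values made up for illustration; they are not the task data or the authors' results, and the helper `precision_recall_f1` is an illustrative name, not part of the described system.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 1000 tweets, of which 2 (0.2%) are positive.
y_true = [1, 1] + [0] * 998

# Baseline that never predicts the positive class.
always_negative = [0] * 1000

# Hypothetical classifier: finds one positive, misses one, raises one false alarm.
some_classifier = [1, 0, 1] + [0] * 997

for name, y_pred in [("always negative", always_negative),
                     ("some classifier", some_classifier)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Both sets of predictions reach the same accuracy on this toy data, yet only the second one retrieves any positive tweet, which is the distinction that precision, recall, and F1 make visible.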
Anthology ID:
2020.smm4h-1.31
Volume:
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
SMM4H
Publisher:
Association for Computational Linguistics
Pages:
165–167
URL:
https://aclanthology.org/2020.smm4h-1.31
Cite (ACL):
Olanrewaju Tahir Aduragba, Jialin Yu, Gautham Senthilnathan, and Alexandra Cristea. 2020. Sentence Contextual Encoder with BERT and BiLSTM for Automatic Classification with Imbalanced Medication Tweets. In Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task, pages 165–167, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Sentence Contextual Encoder with BERT and BiLSTM for Automatic Classification with Imbalanced Medication Tweets (Aduragba et al., SMM4H 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.smm4h-1.31.pdf