LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models
Dasun Athukoralage, Thushari Atapattu, Menasha Thilakaratne, Katrina Falkner
Abstract
This paper presents our approaches for the SMM4H’24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. Our first approach involves fine-tuning a single RoBERTa-large model, while the second approach entails ensembling the results of three fine-tuned BERTweet-large models. We demonstrate that although both approaches exhibit identical performance on validation data, the BERTweet-large ensemble excels on test data. Our best-performing system achieves an F1-score of 0.938 on test data, outperforming the benchmark classifier by 1.18%.
- Anthology ID: 2024.smm4h-1.9
- Volume: Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Dongfang Xu, Graciela Gonzalez-Hernandez
- Venues: SMM4H | WS
- Publisher: Association for Computational Linguistics
- Pages: 38–41
- URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.smm4h-1.9/
- Cite (ACL): Dasun Athukoralage, Thushari Atapattu, Menasha Thilakaratne, and Katrina Falkner. 2024. LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models. In Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, pages 38–41, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models (Athukoralage et al., SMM4H 2024)
- PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.smm4h-1.9.pdf