Abstract
This paper describes our approach to 11 classification tasks (Task1a, Task2a, Task2b, Task3a, Task3b, Task4, Task5, Task6, Task7, Task8, and Task9) from the Social Media Mining for Health (SMM4H) 2022 Shared Tasks. We developed a classification model that incorporates Rdrop to augment data and avoid overfitting, Poly Loss and Focal Loss to alleviate sample imbalance, and pseudo labels to improve model performance. Our submissions scored at or above the median in almost all tasks. In addition, our model achieved the highest score in Task4, and its F1-scores exceeded the median by 7.8% in Task2b and 5.3% in Task3a.
- Anthology ID:
- 2022.smm4h-1.28
- Volume:
- Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Graciela Gonzalez-Hernandez, Davy Weissenbacher
- Venue:
- SMM4H
- Publisher:
- Association for Computational Linguistics
- Pages:
- 98–102
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.smm4h-1.28/
- Cite (ACL):
- Yan Zhuang and Yanru Zhang. 2022. Yet@SMM4H’22: Improved BERT-based classification models with Rdrop and PolyLoss. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 98–102, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- Yet@SMM4H’22: Improved BERT-based classification models with Rdrop and PolyLoss (Zhuang & Zhang, SMM4H 2022)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.smm4h-1.28.pdf
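The abstract names three generic techniques: Focal Loss and Poly Loss against class imbalance, and Rdrop (R-Drop) regularization. As a rough NumPy illustration of these loss terms in their standard published forms, not the authors' code, the sketch below uses common default hyperparameters (`gamma`, `epsilon`, and the R-Drop weight are illustrative; the paper's values are not given here):

```python
import numpy as np

def focal_loss(p_t, gamma=2.0):
    """Focal Loss: -(1 - p_t)^gamma * log(p_t).

    p_t is the predicted probability of the true class; the
    (1 - p_t)^gamma factor down-weights easy, confident examples,
    which helps with imbalanced data. gamma=2.0 is a common default.
    """
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

def poly1_loss(p_t, epsilon=1.0):
    """Poly-1 Loss: cross-entropy plus a leading polynomial term
    epsilon * (1 - p_t). epsilon=1.0 is illustrative only.
    """
    return -np.log(p_t) + epsilon * (1.0 - p_t)

def rdrop_kl(p, q):
    """Symmetric KL term used by R-Drop between the class
    distributions p and q from two forward passes of the same
    input under different dropout masks.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    kl = lambda a, b: np.sum(a * (np.log(a) - np.log(b)))
    return 0.5 * (kl(p, q) + kl(q, p))

# A hard example (low p_t) is penalized more than an easy one
# under both losses; identical dropout passes incur zero KL.
easy, hard = 0.95, 0.3
print(focal_loss(easy), focal_loss(hard))
print(poly1_loss(easy), poly1_loss(hard))
print(rdrop_kl([0.7, 0.3], [0.7, 0.3]))
```

In training, one of the classification losses would be combined with the R-Drop KL term (weighted by a coefficient) as the total objective; this sketch only shows the individual terms.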