SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish
Mariia Chizhikova, Pilar López-Úbeda, Manuel C. Díaz-Galiano, L. Alfonso Ureña-López, M. Teresa Martín-Valdivia
Abstract
This paper describes the participation of the SINAI team in Tasks 5 and 10 of the Social Media Mining for Health (#SMM4H) workshop at COLING-2022. These tasks focus on leveraging Twitter posts written in Spanish for healthcare research. The objective of Task 5 was to classify tweets reporting COVID-19 symptoms, while Task 10 required identifying disease mentions in Twitter posts. The presented systems explore large RoBERTa language models pre-trained on Twitter data for the tweet classification task and on general-domain data for the disease recognition task. We also present a text pre-processing methodology implemented in both systems and describe an initial weakly-supervised fine-tuning phase along with a submission post-processing procedure designed for Task 10. The systems obtained an F1-score of 0.84 on Task 5 and 0.77 on Task 10.
- Anthology ID:
- 2022.smm4h-1.8
- Volume:
- Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editors:
- Graciela Gonzalez-Hernandez, Davy Weissenbacher
- Venue:
- SMM4H
- Publisher:
- Association for Computational Linguistics
- Pages:
- 27–30
- URL:
- https://aclanthology.org/2022.smm4h-1.8
- Cite (ACL):
- Mariia Chizhikova, Pilar López-Úbeda, Manuel C. Díaz-Galiano, L. Alfonso Ureña-López, and M. Teresa Martín-Valdivia. 2022. SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 27–30, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish (Chizhikova et al., SMM4H 2022)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2022.smm4h-1.8.pdf