SpeechTrans@SMM4H’20: Impact of Preprocessing and N-grams on Automatic Classification of Tweets That Mention Medications

Mohamed Lichouri, Mourad Abbas


Abstract
This paper describes our system for automatically classifying tweets that mention medications. We used a Decision Tree classifier for this task. We show that elementary preprocessing steps combined with TF-IDF n-gram features lead to acceptable classifier performance: the recorded F1-score was 74.58% in the development phase and 63.70% in the test phase.
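As a rough illustration of the feature extraction named in the abstract, the sketch below applies elementary preprocessing to tweets and computes TF-IDF weights over word n-grams using only the Python standard library. It is not the authors' implementation: the specific cleanup steps (lowercasing, removing URLs and @mentions), the n-gram range (unigrams and bigrams), the smoothing-free weighting tf * (ln(N/df) + 1), the example tweets, and all function names are assumptions made for illustration. The Decision Tree classifier that would consume these features is omitted.

```python
import math
import re
from collections import Counter


def preprocess(tweet):
    """Assumed elementary cleanup: lowercase, drop URLs and @mentions, tokenize."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    return re.findall(r"[a-z0-9']+", text)     # keep simple word tokens


def word_ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams for n in [n_min, n_max] (range is an assumption)."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs):
    """Return one {ngram: weight} dict per document, weight = tf * (ln(N/df) + 1)."""
    counts = [Counter(word_ngrams(preprocess(d))) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())  # document frequency: one count per document
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty documents
        vectors.append(
            {g: (k / total) * (math.log(n_docs / df[g]) + 1.0) for g, k in c.items()}
        )
    return vectors


# Hypothetical example tweets, not taken from the shared-task data.
tweets = [
    "Took ibuprofen for my headache @friend https://example.com",
    "My headache is gone, what a sunny day",
]
vectors = tfidf(tweets)
```

In this toy example, an n-gram that occurs in only one tweet (such as "ibuprofen") receives a higher weight than one shared by both tweets (such as "headache"), which is the property that makes TF-IDF features useful for separating tweets that mention medications from those that do not.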
Anthology ID:
2020.smm4h-1.19
Volume:
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Graciela Gonzalez-Hernandez, Ari Z. Klein, Ivan Flores, Davy Weissenbacher, Arjun Magge, Karen O'Connor, Abeed Sarker, Anne-Lyse Minard, Elena Tutubalina, Zulfat Miftahutdinov, Ilseyar Alimova
Venue:
SMM4H
Publisher:
Association for Computational Linguistics
Pages:
118–120
URL:
https://aclanthology.org/2020.smm4h-1.19
Cite (ACL):
Mohamed Lichouri and Mourad Abbas. 2020. SpeechTrans@SMM4H’20: Impact of Preprocessing and N-grams on Automatic Classification of Tweets That Mention Medications. In Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task, pages 118–120, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
SpeechTrans@SMM4H’20: Impact of Preprocessing and N-grams on Automatic Classification of Tweets That Mention Medications (Lichouri & Abbas, SMM4H 2020)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/2020.smm4h-1.19.pdf