Abstract
We examine learning offensive content on Twitter with limited, imbalanced data. For this purpose, we investigate the utility of using various data enhancement methods with a host of classical ensemble classifiers. Among the 75 participating teams in SemEval-2019 sub-task B, our system ranks 6th (with a macro F1-score of 0.706). For sub-task C, among the 65 participating teams, our system ranks 9th (with a macro F1-score of 0.587).
 - Anthology ID:
 - S19-2136
 - Volume:
 - Proceedings of the 13th International Workshop on Semantic Evaluation
 - Month:
 - June
 - Year:
 - 2019
 - Address:
 - Minneapolis, Minnesota, USA
 - Editors:
 - Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
 - Venue:
 - SemEval
 - SIG:
 - SIGLEX
 - Publisher:
 - Association for Computational Linguistics
 - Pages:
 - 775–781
 - URL:
 - https://aclanthology.org/S19-2136
 - DOI:
 - 10.18653/v1/S19-2136
 - Cite (ACL):
 - Arun Rajendran, Chiyu Zhang, and Muhammad Abdul-Mageed. 2019. UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 775–781, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
 - Cite (Informal):
 - UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data (Rajendran et al., SemEval 2019)
 - PDF:
 - https://preview.aclanthology.org/ingest-acl-2023-videos/S19-2136.pdf