UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data

Arun Rajendran, Chiyu Zhang, Muhammad Abdul-Mageed


Abstract
We examine learning offensive content on Twitter with limited, imbalanced data. To this end, we investigate the utility of various data enhancement methods combined with a host of classical ensemble classifiers. Among the 75 participating teams in SemEval-2019 sub-task B, our system ranks 6th (with a macro F1-score of 0.706). For sub-task C, among the 65 participating teams, our system ranks 9th (with a macro F1-score of 0.587).
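The rankings above are reported in macro F1-score, which averages per-class F1 without weighting by class frequency, so minority classes count as much as majority ones on imbalanced data. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the labels "A" and "B" are hypothetical and chosen for illustration only, and this is not the official task scorer or the authors' evaluation code.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this label.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration (not the task's label set).
gold = ["A", "A", "A", "B"]
pred = ["A", "A", "B", "B"]
score = macro_f1(gold, pred)
```

In this toy example class "A" has F1 = 0.8 and class "B" has F1 = 2/3, so the macro average is 11/15, even though "A" is three times as frequent as "B" in the gold labels.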
Anthology ID:
S19-2136
Volume:
Proceedings of the 13th International Workshop on Semantic Evaluation
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota, USA
Editors:
Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
775–781
URL:
https://aclanthology.org/S19-2136
DOI:
10.18653/v1/S19-2136
Cite (ACL):
Arun Rajendran, Chiyu Zhang, and Muhammad Abdul-Mageed. 2019. UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 775–781, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal):
UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data (Rajendran et al., SemEval 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/S19-2136.pdf