Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels Using Statistical Sampling and Post-Processing
Manikandan Ravikiran, Amin Ekant Muljibhai, Toshinori Miyoshi, Hiroaki Ozaki, Yuta Koreeda, Sakata Masayuki
Abstract
In this paper, we present our participation in SemEval-2020 Task 12, Subtask A (English), which focuses on offensive language identification from noisy labels. To this end, we developed a hybrid system that combines a BERT classifier trained on tweets selected using a Statistical Sampling Algorithm (SA) with Post-Processing (PP) using an offensive wordlist. Our system achieved 34th position with a macro-averaged F1-score (Macro-F1) of 0.90913 over both the offensive and non-offensive classes. We further present comprehensive results and an error analysis to assist future research in offensive language identification with noisy labels.
- Anthology ID:
- 2020.semeval-1.258
- Volume:
- Proceedings of the Fourteenth Workshop on Semantic Evaluation
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona (online)
- Editors:
- Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- International Committee for Computational Linguistics
- Pages:
- 1961–1967
- URL:
- https://aclanthology.org/2020.semeval-1.258
- DOI:
- 10.18653/v1/2020.semeval-1.258
- Cite (ACL):
- Manikandan Ravikiran, Amin Ekant Muljibhai, Toshinori Miyoshi, Hiroaki Ozaki, Yuta Koreeda, and Sakata Masayuki. 2020. Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels Using Statistical Sampling and Post-Processing. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1961–1967, Barcelona (online). International Committee for Computational Linguistics.
- Cite (Informal):
- Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels Using Statistical Sampling and Post-Processing (Ravikiran et al., SemEval 2020)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2020.semeval-1.258.pdf