IRCologne at GermEval 2021: Toxicity Classification

Fabian Haak, Björn Engelmann


Abstract
In this paper, we describe TH Köln's submission to the Shared Task on the Identification of Toxic Comments at GermEval 2021. Toxicity is a severe and latent problem in comments in online discussions. Complex language-model-based methods have shown the most success in identifying toxicity. However, these approaches lack explainability and might be insensitive to domain-specific renditions of toxicity. In the scope of the GermEval 2021 toxic comment classification task (Risch et al., 2021), we employed a simple but promising combination of term-frequency-based classification and rule-based labeling to produce toxicity predictions that are both effective and explainable.
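To illustrate the general idea of pairing a term-frequency classifier with rule-based labeling, here is a minimal sketch. It is not the authors' system: the toy comments, the `RULE_TERMS` lexicon, and the choice of a multinomial naive Bayes model over raw term counts are all assumptions made for illustration. The point it shows is that each prediction carries a short reason, either the rule that fired or the term-frequency model.

```python
import math
from collections import Counter

# Hypothetical toy training data (1 = toxic, 0 = not toxic); not from the paper.
TRAIN = [
    ("du bist ein idiot", 1),
    ("so ein dummer idiot", 1),
    ("danke für den beitrag", 0),
    ("guter beitrag danke", 0),
]

# Assumed hand-written rule: any of these terms marks a comment as toxic.
RULE_TERMS = {"abschaum"}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def train_nb(samples):
    """Collect per-class term counts and document counts."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab


def nb_predict(model, text):
    """Multinomial naive Bayes with add-one smoothing; unknown terms are skipped."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in (0, 1):
        total_terms = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log((counts[label][tok] + 1) / (total_terms + len(vocab)))
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


def classify(model, text):
    """Rules take precedence; otherwise fall back to the term-frequency model."""
    matched = sorted(RULE_TERMS & set(tokenize(text)))
    if matched:
        return 1, "rule: " + ", ".join(matched)
    return nb_predict(model, text), "term-frequency model"


model = train_nb(TRAIN)
for comment in ["du idiot", "danke guter beitrag", "was für ein abschaum"]:
    print(comment, "->", classify(model, comment))
```

Letting the rules override the statistical model is only one possible way to combine the two signals; rule outputs could equally be used as extra features or as weak labels for training.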
Anthology ID:
2021.germeval-1.7
Volume:
Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments
Month:
September
Year:
2021
Address:
Duesseldorf, Germany
Editors:
Julian Risch, Anke Stoll, Lena Wilms, Michael Wiegand
Venue:
GermEval
Publisher:
Association for Computational Linguistics
Pages:
47–53
URL:
https://aclanthology.org/2021.germeval-1.7
Cite (ACL):
Fabian Haak and Björn Engelmann. 2021. IRCologne at GermEval 2021: Toxicity Classification. In Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments, pages 47–53, Duesseldorf, Germany. Association for Computational Linguistics.
Cite (Informal):
IRCologne at GermEval 2021: Toxicity Classification (Haak & Engelmann, GermEval 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2021.germeval-1.7.pdf