Smatgrisene at SemEval-2020 Task 12: Offense Detection by AI - with a Pinch of Real I

Peter Juel Henrichsen, Marianne Rathje


Abstract
This paper discusses how ML-based classifiers can be enhanced disproportionately by adding small amounts of qualitative linguistic knowledge. As an example, we present the Danish classifier Smatgrisene, our contribution to the recent OffensEval Challenge 2020. The classifier was trained on 3000 social media posts annotated for offensiveness, supplemented by rules extracted from the reference work on Danish offensive language (Rathje 2014b). Smatgrisene did surprisingly well in the competition in spite of its extremely simple design, showing an interesting trade-off between technological muscle and linguistic intelligence. Finally, we comment on the perspectives in combining qualitative and quantitative methods for NLP.
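The abstract's general idea, a simple statistical classifier supplemented by a few hand-written rules, can be illustrated with a minimal sketch. This is not the Smatgrisene implementation: the abstract does not specify the model or the rule format, so the naive Bayes scorer, the rule lexicon `RULE_TERMS`, the function names, and the English toy posts below are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical rule lexicon: placeholder tokens standing in for hand-written
# linguistic rules. These are NOT taken from the paper or from Rathje (2014b).
RULE_TERMS = {"badword1", "badword2"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def train(posts):
    """Count tokens per class. posts: list of (text, label), label 'OFF' or 'NOT'.

    Assumes at least one post of each class is present.
    """
    counts = {"OFF": Counter(), "NOT": Counter()}
    docs = Counter()
    for text, label in posts:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts["OFF"]) | set(counts["NOT"])
    return counts, docs, vocab


def statistical_score(model, text):
    """Log-odds of OFF versus NOT under naive Bayes with add-one smoothing.

    Tokens never seen in training are skipped.
    """
    counts, docs, vocab = model
    score = math.log(docs["OFF"] / docs["NOT"])
    for label, sign in (("OFF", 1), ("NOT", -1)):
        total = sum(counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                score += sign * math.log((counts[label][tok] + 1) / total)
    return score


def classify(model, text):
    """Rules take precedence; otherwise fall back to the statistical score."""
    if any(tok in RULE_TERMS for tok in tokenize(text)):
        return "OFF"
    return "OFF" if statistical_score(model, text) > 0 else "NOT"


# Toy training data (assumed, for illustration only).
toy_posts = [
    ("you are an idiot", "OFF"),
    ("what an idiot move", "OFF"),
    ("have a nice day", "NOT"),
    ("what a nice move", "NOT"),
]
model = train(toy_posts)
```

In this sketch a post such as "have a nice badword1" receives a negative statistical score, because every known token leans towards NOT, yet it is still labelled OFF because a rule term matches. That is one simple way in which a small amount of curated knowledge can correct a model trained on little data.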
Anthology ID:
2020.semeval-1.284
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2140–2145
URL:
https://aclanthology.org/2020.semeval-1.284
DOI:
10.18653/v1/2020.semeval-1.284
Cite (ACL):
Peter Juel Henrichsen and Marianne Rathje. 2020. Smatgrisene at SemEval-2020 Task 12: Offense Detection by AI - with a Pinch of Real I. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2140–2145, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
Smatgrisene at SemEval-2020 Task 12: Offense Detection by AI - with a Pinch of Real I (Henrichsen & Rathje, SemEval 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.semeval-1.284.pdf