Euphemistic Abuse – A New Dataset and Classification Experiments for Implicitly Abusive Language
Michael Wiegand, Jana Kampfmeier, Elisabeth Eder, Josef Ruppenhofer
Abstract
We address the task of identifying euphemistic abuse (e.g. “You inspire me to fall asleep”) paraphrasing simple explicitly abusive utterances (e.g. “You are boring”). For this task, we introduce a novel dataset that has been created via crowdsourcing. Special attention has been paid to the generation of appropriate negative (non-abusive) data. We report on classification experiments showing that classifiers trained on previous datasets are less capable of detecting such abuse. Best automatic results are obtained by a classifier that augments training data from our new dataset with automatically-generated GPT-3 completions. We also present a classifier that combines a few manually extracted features that exemplify the major linguistic phenomena constituting euphemistic abuse.
- Anthology ID:
- 2023.emnlp-main.1012
- Volume:
- Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 16280–16297
- URL:
- https://aclanthology.org/2023.emnlp-main.1012
- DOI:
- 10.18653/v1/2023.emnlp-main.1012
- Cite (ACL):
- Michael Wiegand, Jana Kampfmeier, Elisabeth Eder, and Josef Ruppenhofer. 2023. Euphemistic Abuse – A New Dataset and Classification Experiments for Implicitly Abusive Language. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 16280–16297, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Euphemistic Abuse – A New Dataset and Classification Experiments for Implicitly Abusive Language (Wiegand et al., EMNLP 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.emnlp-main.1012.pdf