Ole Borchardt
2023
Trigger Warnings: Bootstrapping a Violence Detector for Fan Fiction
Magdalena Wolska | Matti Wiegmann | Christopher Schröder | Ole Borchardt | Benno Stein | Martin Potthast
Findings of the Association for Computational Linguistics: EMNLP 2023
We present the first dataset and evaluation results on a newly defined task: assigning trigger warnings. We introduce a labeled corpus of narrative fiction from Archive of Our Own (AO3), a popular fan fiction site, and define a document-level classification task to determine whether or not to assign a trigger warning to an English story. We focus on the most commonly assigned trigger type, “violence”, using the warning labels provided by AO3 authors as ground-truth labels. We train SVM, BERT, and Longformer models on three datasets sampled from the corpus and achieve F1 scores between 0.8 and 0.9, indicating that assigning trigger warnings for violence is feasible.
Trigger Warning Assignment as a Multi-Label Document Classification Problem
Matti Wiegmann | Magdalena Wolska | Christopher Schröder | Ole Borchardt | Benno Stein | Martin Potthast
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
A trigger warning is used to warn people about potentially disturbing content. We introduce trigger warning assignment as a multi-label classification task, create the Webis Trigger Warning Corpus 2022, and with it the first dataset of 1 million fanfiction works from Archive of Our Own with up to 36 different warnings per document. To provide a reliable catalog of trigger warnings, we organized 41 million free-form tags assigned by fanfiction authors into the first comprehensive taxonomy of trigger warnings by mapping them to the 36 institutionally recommended warnings. To determine the best operationalization of trigger warnings, we explore state-of-the-art multi-label models, examining the trade-off between assigning coarse- and fine-grained warnings, open- and closed-set classification, document length, and label confidence. Our models achieve micro-F1 scores of about 0.5, which reveals the difficulty of the task. Tailored representations, long input sequences, and a higher recall on rare warnings would help.