Pia Pachinger
2025
A Disaggregated Dataset on English Offensiveness Containing Spans
Pia Pachinger | Janis Goldzycher | Anna M. Planitzer | Julia Neidhardt | Allan Hanbury
Proceedings of the 4th Workshop on Perspectivist Approaches to NLP
Toxicity labels at sub-document granularity and disaggregated labels enable more nuanced and personalized toxicity classification and facilitate analysis. We re-annotate a subset of 1,983 posts from the Jigsaw Toxic Comment Classification Challenge and provide disaggregated toxicity labels as well as spans that identify inappropriate language and the targets of toxic statements. Manual analysis shows that five annotations per instance effectively capture meaningful disagreement patterns and allow for finer distinctions between genuine disagreement and disagreement arising from annotation error or inconsistency. Our main findings are: (1) disagreement often stems from divergent interpretations of edge-case toxicity; (2) disagreement is especially high for toxic statements involving non-human targets; (3) disagreement on whether a passage constitutes inappropriate language occurs not only for inherently questionable terms, but also for words that may be inappropriate in specific contexts while remaining acceptable in others; (4) Transformer-based models effectively learn from data aggregated in a way that is more sensitive towards minority opinions labeling a post as toxic, which reduces false negative classifications. We publish the new annotations under the CC BY 4.0 license.
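As a minimal illustration of the minority-sensitive aggregation referred to in finding (4), the Python sketch below labels a post as toxic once a configurable minority of its five annotators does so. The function name and the 2-out-of-5 threshold are assumptions made for illustration, not the aggregation scheme actually used in the paper.

```python
from typing import List

def aggregate_toxicity(annotations: List[int], min_toxic_votes: int = 2) -> int:
    """Return 1 (toxic) if at least `min_toxic_votes` annotators marked the
    post as toxic, else 0. The 2-out-of-5 threshold is an illustrative
    assumption, not a value taken from the paper."""
    return int(sum(annotations) >= min_toxic_votes)

# Two of five annotators consider the post toxic: the aggregated label is 1,
# whereas strict majority voting would yield 0.
print(aggregate_toxicity([1, 0, 1, 0, 0]))  # -> 1
```

Lowering the aggregation threshold below a strict majority is one way an aggregated training signal can become more sensitive to minority "toxic" votes and thus reduce false negatives.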
2024
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection
Pia Pachinger | Janis Goldzycher | Anna Planitzer | Wojciech Kusa | Allan Hanbury | Julia Neidhardt
Findings of the Association for Computational Linguistics: ACL 2024
Model interpretability in toxicity detection greatly profits from token-level annotations. However, such annotations are currently only available for English. We introduce a dataset for offensive language detection comprising 4,562 user comments sourced from a news forum and notable for its incorporation of the Austrian German dialect. In addition to binary offensiveness classification, we identify spans within each comment constituting vulgar language or representing targets of offensive statements. We evaluate fine-tuned Transformer models as well as large language models in a zero- and few-shot fashion. The results indicate that while the fine-tuned models excel in detecting linguistic peculiarities such as vulgar dialect, large language models demonstrate superior performance in detecting offensiveness in AustroTox.
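The zero-shot setup mentioned above can be pictured roughly as follows. This is only a sketch: it assumes a generic `query_llm` callable (hypothetical, not part of the paper's released code) and an illustrative German prompt wording.

```python
def build_zero_shot_prompt(comment: str) -> str:
    """Embed a user comment into a zero-shot offensiveness prompt.
    The German instruction wording is illustrative only."""
    return (
        "Ist der folgende Forumskommentar beleidigend? "
        "Antworte nur mit 'ja' oder 'nein'.\n\n"
        f"Kommentar: {comment}"
    )

def classify_offensive(comment: str, query_llm) -> int:
    """Map the model's answer to a binary offensiveness label.
    `query_llm` stands in for any chat-completion call."""
    answer = query_llm(build_zero_shot_prompt(comment))
    return int(answer.strip().lower().startswith("ja"))
```

A few-shot variant would simply prepend labelled example comments to the prompt, while the fine-tuned baselines instead train a Transformer classifier directly on the annotated comments.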
2023
Toward Disambiguating the Definitions of Abusive, Offensive, Toxic, and Uncivil Comments
Pia Pachinger | Allan Hanbury | Julia Neidhardt | Anna Planitzer
Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
The definitions of abusive, offensive, toxic and uncivil comments used for annotating corpora for automated content moderation overlap considerably, and researchers call for their disambiguation. We summarize the definitions of these terms as they appear in 23 papers across different fields. We compare examples given for uncivil, offensive, and toxic comments, attempting to foster more unified scientific resources. Additionally, we stress that the term incivility, which frequently appears in social science literature, is hardly mentioned in the computational linguistics and natural language processing literature we analyzed.