Sebastian Loftus
2025
Using LLMs and Preference Optimization for Agreement-Aware HateWiC Classification
Sebastian Loftus | Adrian Mülthaler | Sanne Hoeken | Sina Zarrieß | Ozge Alacam
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)
Annotator disagreement poses a significant challenge in subjective tasks like hate speech detection. In this paper, we introduce a novel variant of the HateWiC task that explicitly models annotator agreement by estimating the proportion of annotators who classify the meaning of a term as hateful. To tackle this challenge, we explore the use of Llama 3 models fine-tuned through Direct Preference Optimization (DPO). Our experiments show that while LLMs perform well for majority-based hate classification, they struggle with the more complex agreement-aware task. DPO fine-tuning offers improvements, particularly when applied to instruction-tuned models. Yet, our results emphasize the need for improved modeling of subjectivity in hate classification, and this study can serve as a foundation for future advancements.
2024
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations
Siyao Peng | Zihang Sun | Sebastian Loftus | Barbara Plank
Proceedings of the Third Workshop on Understanding Implicit and Underspecified Language
Named Entity Recognition (NER) is a key information extraction task with a long-standing tradition. While recent studies address and aim to correct annotation errors via re-labeling efforts, little is known about the sources of label variation, such as text ambiguity, annotation error, or guideline divergence. This is especially the case for high-quality datasets and beyond English CoNLL03. This paper studies disagreements in expert-annotated named entity datasets for three varieties: English, Danish, and Bavarian. We show that text ambiguity and artificial guideline changes are dominant factors for diverse annotations among high-quality revisions. We survey student annotations on a subset of difficult entities and substantiate the feasibility and necessity of manifold annotations for understanding named entity ambiguities from a distributional perspective.
Co-authors
- Özge Alaçam 1
- Sanne Hoeken 1
- Adrian Mülthaler 1
- Siyao Peng 1
- Barbara Plank 1