Mark Díaz

Also published as: Mark Diaz


Accounting for Offensive Speech as a Practice of Resistance
Mark Diaz | Razvan Amironesei | Laura Weidinger | Iason Gabriel
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Tasks such as toxicity detection, hate speech detection, and online harassment detection have been developed for identifying interactions involving offensive speech. In this work we articulate the need for a relational understanding of offensiveness to help distinguish denotative offensive speech from offensive speech serving as a mechanism through which marginalized communities resist oppressive social norms. Using examples from the queer community, we argue that evaluations of offensive speech must focus on the impacts of language use. We call this the cynic perspective– or a characteristic of language with roots in Cynic philosophy that pertains to employing offensive speech as a practice of resistance. We also explore the degree to which NLP systems may encounter limits to modeling relational context.

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations
Aida Mostafazadeh Davani | Mark Díaz | Vinodkumar Prabhakaran
Transactions of the Association for Computational Linguistics, Volume 10

Majority voting and averaging are common approaches used to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreements may capture important nuances in such tasks that are often ignored while aggregating annotations to a single ground truth. In order to address this, we investigate the efficacy of multi-annotator models. In particular, our multi-task based approach treats predicting each annotators’ judgements as separate subtasks, while sharing a common learned representation of the task. We show that this approach yields same or better performance than aggregating labels in the data prior to training across seven different binary classification tasks. Our approach also provides a way to estimate uncertainty in predictions, which we demonstrate better correlate with annotation disagreements than traditional methods. Being able to model uncertainty is especially useful in deployment scenarios where knowing when not to make a prediction is important.


On Releasing Annotator-Level Labels and Information in Datasets
Vinodkumar Prabhakaran | Aida Mostafazadeh Davani | Mark Diaz
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop

A common practice in building NLP datasets, especially using crowd-sourced annotations, involves obtaining multiple annotator judgements on the same data instances, which are then flattened to produce a single “ground truth” label or score, through majority voting, averaging, or adjudication. While these approaches may be appropriate in certain annotation tasks, such aggregations overlook the socially constructed nature of human perceptions that annotations for relatively more subjective tasks are meant to capture. In particular, systematic disagreements between annotators owing to their socio-cultural backgrounds and/or lived experiences are often obfuscated through such aggregations. In this paper, we empirically demonstrate that label aggregation may introduce representational biases of individual and group perspectives. Based on this finding, we propose a set of recommendations for increased utility and transparency of datasets for downstream use cases.