Salvatore Ruggieri


2022

pdf
Bias Discovery within Human Raters: A Case Study of the Jigsaw Dataset
Marta Marchiori Manerba | Riccardo Guidotti | Lucia Passaro | Salvatore Ruggieri
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022

Understanding and quantifying the bias introduced by human annotation of data is a crucial problem for trustworthy supervised learning. Recently, a perspectivist trend has emerged in the NLP community, focusing on the inadequacy of previous aggregation schemes, which suppose the existence of single ground truth. This assumption is particularly problematic for sensitive tasks involving subjective human judgments, such as toxicity detection. To address these issues, we propose a preliminary approach for bias discovery within human raters by exploring individual ratings for specific sensitive topics annotated in the texts. Our analysis’s object consists of the Jigsaw dataset, a collection of comments aiming at challenging online toxicity identification.