@inproceedings{zakizadeh-pilehvar-2025-blind,
    title = "Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets",
    author = "Zakizadeh, Mahdi  and
      Pilehvar, Mohammad Taher",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1162/",
    pages = "22838--22851",
    ISBN = "979-8-89176-332-6",
    abstract = "Accurately measuring gender stereotypical bias in language models is a complex task with many hidden aspects. Current benchmarks have underestimated this multifaceted challenge and failed to capture the full extent of the problem. This paper examines the inconsistencies between intrinsic stereotype benchmarks. We propose that currently available benchmarks each capture only partial facets of gender stereotypes, and when considered in isolation, they provide just a fragmented view of the broader landscape of bias in language models. Using StereoSet and CrowS-Pairs as case studies, we investigated how data distribution affects benchmark results. By applying a framework from social psychology to balance the data of these benchmarks across various components of gender stereotypes, we demonstrated that even simple balancing techniques can significantly improve the correlation between different measurement approaches. Our findings underscore the complexity of gender stereotyping in language models and point to new directions for developing more refined techniques to detect and reduce bias."
}