Analyzing the Limits of Self-Supervision in Handling Bias in Language

Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Mohit Bansal, Dilek Hakkani-Tur


Abstract
Prompting inputs with natural language task descriptions has emerged as a popular mechanism to elicit reasonably accurate outputs from large-scale generative language models with little to no in-context supervision. This also helps gain insight into how well language models capture the semantics of a wide range of downstream tasks purely from self-supervised pre-training on massive corpora of unlabeled text. Such models have naturally also been exposed to a lot of undesirable content, such as racist and sexist language, and there is only limited work on how aware models are along these dimensions. In this paper, we define and comprehensively evaluate how well such language models capture the semantics of four bias-related tasks: diagnosis, identification, extraction, and rephrasing. We define three broad classes of task descriptions for these tasks (statement, question, and completion), with numerous lexical variants within each class. We study the efficacy of prompting for each task using these classes and the null task description across several decoding methods and few-shot examples. Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation. We believe our work is an important step towards unbiased language models by quantifying the limits of current self-supervision objectives at accomplishing such sociologically challenging tasks.
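As a rough illustration of the prompting setup the abstract describes, the sketch below assembles a prompt for the diagnosis task from a task description class (statement, question, completion, or null) plus optional few-shot examples. The template wordings, the `Text:`/`Answer:` layout, and the names `DESCRIPTIONS`, `format_example`, and `build_prompt` are all hypothetical and are not taken from the paper; only the four class names come from the abstract.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical wordings (assumptions, not the paper's templates): one
# illustrative task description per class for the bias diagnosis task.
DESCRIPTIONS = {
    "statement": "Determine whether the following text is biased.",
    "question": "Is the following text biased?",
    "completion": None,  # expressed as a trailing stem instead of a header
    "null": None,        # the null task description: no description at all
}


def format_example(text: str, label: Optional[str], description_class: str) -> str:
    """Render one example; label=None leaves the slot open for the model."""
    if description_class == "completion":
        stem = f"Text: {text}\nThe text above is"
    else:
        stem = f"Text: {text}\nAnswer:"
    return stem if label is None else f"{stem} {label}"


def build_prompt(
    text: str,
    description_class: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Join the optional description, few-shot examples, and the open query."""
    if description_class not in DESCRIPTIONS:
        raise ValueError(f"unknown task description class: {description_class}")
    parts = []
    header = DESCRIPTIONS[description_class]
    if header is not None:
        parts.append(header)
    for ex_text, ex_label in examples:
        parts.append(format_example(ex_text, ex_label, description_class))
    parts.append(format_example(text, None, description_class))
    return "\n\n".join(parts)
```

Calling `build_prompt(sentence, "null")` yields the bare input with no task description, which is the baseline the other three classes would be compared against; passing a list of `(text, label)` pairs adds few-shot examples in the same layout.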
Anthology ID:
2022.findings-emnlp.545
Original:
2022.findings-emnlp.545v1
Version 2:
2022.findings-emnlp.545v2
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
7372–7386
URL:
https://aclanthology.org/2022.findings-emnlp.545
DOI:
10.18653/v1/2022.findings-emnlp.545
Cite (ACL):
Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Mohit Bansal, and Dilek Hakkani-Tur. 2022. Analyzing the Limits of Self-Supervision in Handling Bias in Language. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 7372–7386, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Analyzing the Limits of Self-Supervision in Handling Bias in Language (Bauer et al., Findings 2022)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-emnlp.545.pdf
Video:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-emnlp.545.mp4