Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources

Ninareh Mehrabi; Pei Zhou; Fred Morstatter; Jay Pujara; Xiang Ren; Aram Galstyan

doi:10.18653/v1/2021.emnlp-main.410

Abstract

Warning: this paper contains content that may be offensive or upsetting. Commonsense knowledge bases (CSKBs) are increasingly used for various natural language processing tasks. Since CSKBs are mostly human-generated and may reflect societal biases, it is important to ensure that such biases are not conflated with the notion of commonsense. Here we focus on two widely used CSKBs, ConceptNet and GenericsKB, and establish the presence of bias in both in the form of two types of representational harms: overgeneralization of polarized perceptions and representation disparity across different demographic groups. Next, we find similar representational harms in downstream models that use ConceptNet. Finally, we propose a filtering-based approach for mitigating such harms, and observe that it can reduce the issues in both resources and models but leads to a performance drop, leaving room for future work to build fairer and stronger commonsense models.

Anthology ID: 2021.emnlp-main.410
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5016–5033
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.emnlp-main.410/
DOI: 10.18653/v1/2021.emnlp-main.410
Cite (ACL): Ninareh Mehrabi, Pei Zhou, Fred Morstatter, Jay Pujara, Xiang Ren, and Aram Galstyan. 2021. Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5016–5033, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources (Mehrabi et al., EMNLP 2021)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.emnlp-main.410.pdf
Video: https://preview.aclanthology.org/jlcl-multiple-ingestion/2021.emnlp-main.410.mp4
Data: ConceptNet, GenericsKB
