CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models

Jiaxu Zhao, Meng Fang, Zijing Shi, Yitong Li, Ling Chen, Mykola Pechenizkiy


Abstract
redWarning: This paper contains content that may be offensive or upsetting.Pretrained conversational agents have been exposed to safety issues, exhibiting a range of stereotypical human biases such as gender bias. However, there are still limited bias categories in current research, and most of them only focus on English. In this paper, we introduce a new Chinese dataset, CHBias, for bias evaluation and mitigation of Chinese conversational language models.Apart from those previous well-explored bias categories, CHBias includes under-explored bias categories, such as ageism and appearance biases, which received less attention. We evaluate two popular pretrained Chinese conversational models, CDial-GPT and EVA2.0, using CHBias. Furthermore, to mitigate different biases, we apply several debiasing methods to the Chinese pretrained models. Experimental results show that these Chinese pretrained models are potentially risky for generating texts that contain social biases, and debiasing methods using the proposed dataset can make response generation less biased while preserving the models’ conversational capabilities.
Anthology ID:
2023.acl-long.757
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
13538–13556
Language:
URL:
https://aclanthology.org/2023.acl-long.757
DOI:
10.18653/v1/2023.acl-long.757
Bibkey:
Cite (ACL):
Jiaxu Zhao, Meng Fang, Zijing Shi, Yitong Li, Ling Chen, and Mykola Pechenizkiy. 2023. CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13538–13556, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models (Zhao et al., ACL 2023)
Copy Citation:
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.acl-long.757.pdf
Video:
 https://preview.aclanthology.org/emnlp-22-attachments/2023.acl-long.757.mp4