Shucheng Zhu


2022

Analysis of Gender Bias in Social Perception and Judgement Using Chinese Word Embeddings
Jiali Li | Shucheng Zhu | Ying Liu | Pengyuan Liu
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Gender is a construction in line with social perception and judgment, and language is an important means of this construction. When natural language processing tools, such as word embeddings, associate gender with the relevant categories of social perception and judgment, they are likely to cause bias and harm to groups that do not conform to mainstream social perception and judgment. Using 12,251 Chinese word embeddings as an intermediary, this paper studies the relationship between social perception and judgment categories and gender. The results reveal that these grammatically gender-neutral Chinese word embeddings show a certain gender bias, which is consistent with mainstream society's perception and judgment of gender. Men are judged by their actions and perceived as bad, easily disgusted, bad-tempered and rational roles, while women are judged by their appearances and perceived as perfect, either happy or sad, and emotional roles.

中文自然语言处理多任务中的职业性别偏见测量(Measurement of Occupational Gender Bias in Chinese Natural Language Processing Tasks)
Mengqing Guo (郭梦清) | Jiali Li (李加厉) | Jishun Zhao (赵继舜) | Shucheng Zhu (朱述承) | Ying Liu (刘颖) | Pengyuan Liu (刘鹏远)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

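Although pessimists hold that gender equality can never exist in the workplace, attitudes are changing, and more and more people believe that the choice of occupation should depend only on individual ability and not on a person's gender. Occupational gender bias has been found across natural language processing tasks. However, existing studies typically target specific English-language tasks, and there is a lack of research that measures occupational gender bias for Chinese across multiple tasks. Based on Holland's occupational model, this paper starts from three common tasks in Chinese natural language processing and measures occupational gender bias in word embeddings, coreference resolution, and text generation. It finds that occupational gender bias in the different tasks shares certain commonalities while also showing distinctive differences. Overall, the occupational gender bias in the different tasks reflects real-world stereotypes about the occupations chosen by people of different genders. In addition, when designing bias metrics for different tasks, the influence of linguistic factors such as register and word order also needs to be considered.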

2021

中文句子级性别无偏数据集构建及预训练语言模型的性别偏度评估(Construction of Chinese Sentence-Level Gender-Unbiased Data Set and Evaluation of Gender Bias in Pre-Training Language)
Jishun Zhao (赵继舜) | Bingjie Du (杜冰洁) | Shucheng Zhu (朱述承) | Pengyuan Liu (刘鹏远)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

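Gender bias is widespread in models across natural language processing tasks. However, there is currently no dataset for evaluating and mitigating gender bias in Chinese, so gender bias in Chinese natural language processing models cannot be evaluated. First, based on 16 pairs of gendered appellation terms, this paper selects gender-unbiased sentences from a print-media corpus and constructs SlguSet, a Chinese sentence-level gender-unbiased dataset containing 20,000 sentences. It then proposes a metric for measuring the degree of gender bias in pre-trained language models and evaluates gender bias in five popular pre-trained language models. The results show that Chinese pre-trained language models exhibit varying degrees of gender bias and that the constructed dataset is well suited to evaluating gender bias in Chinese pre-trained language models. The dataset can also be used to evaluate debiasing methods for pre-trained language models.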

2020

伟大的男人和倔强的女人:基于语料库的形容词性别偏度历时研究(Great Males and Stubborn Females: A Diachronic Study of Corpus-Based Gendered Skewness in Chinese Adjectives)
Shucheng Zhu (朱述承) | Pengyuan Liu (刘鹏远)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

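Gender bias is a research focus for both sociolinguists and computational linguists, but most existing studies are based on English; few address gender bias in Chinese, and studies based on adjectives are especially lacking. Yet adjectives are a powerful handle for gauging society's role norms for men and women. This paper first uses a questionnaire to construct a dataset of 466 adjectives, defines gender skewness as the degree to which a given adjective's meaning matches the male or female group, and computes the gender skewness of every adjective in the dataset. Then, based on the DCC corpus, it studies the overall diachronic change in the gender skewness of adjectives in the People's Daily (《人民日报》) and examines diachronic change in adjectives that collocate with personal names. It finds that adjectives used in the People's Daily show an overall trend toward neutrality over time but were markedly masculine during the Cultural Revolution, and that adjectives collocating with male names show an overall trend toward neutrality.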