HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models
Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Anton Ragni, Shi Wang, Jie Fu
Abstract
Fairness has become a trending topic in natural language processing (NLP), covering biases that target social groups such as genders and religions. Yet regional bias, another long-standing global discrimination problem, remains largely unexplored. We therefore present a study analysing the regional bias learned by the pre-trained language models (LMs) that are broadly used in NLP tasks. While verifying the existence of regional bias in LMs, we find that the biases towards regional groups can be largely affected by the corresponding geographical clustering. We accordingly propose a hierarchical regional bias evaluation method (HERB) that utilises information from sub-region clusters to quantify the bias in pre-trained LMs. Experiments show that our hierarchical metric can effectively evaluate regional bias across comprehensive topics and measure the potential regional bias that may propagate to downstream tasks. Our code is available at https://github.com/Bernard-Yang/HERB.
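As a rough illustration of the idea sketched in the abstract, and not the paper's actual HERB formulation (see the paper and repository for that), the following Python sketch aggregates per-region bias scores through geographical sub-region clusters, letting the dispersion within each cluster contribute to the cluster-level score. All names and numbers below are hypothetical.

```python
# Illustrative sketch only: hierarchical aggregation of regional bias
# scores via geographical clusters. This is NOT the HERB metric from the
# paper; it merely demonstrates the general idea that sub-region
# clustering can inform a region-level bias score.

from statistics import mean, pstdev

def cluster_bias(member_scores: list[float]) -> float:
    """Aggregate the bias scores of regions in one geographical cluster.

    The dispersion term reflects the intuition that bias towards a
    region interacts with how its neighbouring sub-regions are treated.
    """
    return mean(member_scores) + pstdev(member_scores)

def hierarchical_bias(clusters: dict[str, list[float]]) -> float:
    """Average the per-cluster scores into a single model-level score."""
    return mean(cluster_bias(scores) for scores in clusters.values())

# Hypothetical per-region bias scores, grouped by geographical cluster.
clusters = {
    "cluster_a": [0.12, 0.08, 0.15],
    "cluster_b": [0.40, 0.35],
}
print(f"overall regional bias: {hierarchical_bias(clusters):.3f}")
```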
- Anthology ID:
- 2022.findings-aacl.32
- Volume:
- Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
- Month:
- November
- Year:
- 2022
- Address:
- Online only
- Editors:
- Yulan He, Heng Ji, Sujian Li, Yang Liu, Chia-Hui Chang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 334–346
- URL:
- https://aclanthology.org/2022.findings-aacl.32
- Cite (ACL):
- Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Anton Ragni, Shi Wang, and Jie Fu. 2022. HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models. In Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, pages 334–346, Online only. Association for Computational Linguistics.
- Cite (Informal):
- HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models (Li et al., Findings 2022)
- PDF:
- https://aclanthology.org/2022.findings-aacl.32.pdf