AmchiBias: Measuring Stereotypical Bias in Goan Identity Groups with a Minimal Pair Dataset in English and Konkani

Michelle Barbosa; Sebastian Padó; Franziska Weeber

Abstract

Socio-cultural stereotypical bias is an important consideration in the development and deployment of NLP systems. However, it is often considered only at the national level, despite rich subnational socio-cultural structures. We present AmchiBias, the first benchmark for measuring socio-cultural stereotypical bias for the Indian state of Goa with its unique historically multicultural setting. It covers various Goan identity groups and comprises 313 minimal pairs across eight sociodemographic dimensions in both English and Devanagari Konkani. We then evaluate stereotypical bias in five multilingual encoder models on this benchmark. We find near-chance scores in Konkani, reflecting language incompetence for general multilingual models and a lack of Goan cultural competence for Indian language models. Queried in English, models with stronger Indian language coverage show higher bias for pan-Indian groups than for hyperlocal Goan groups. This suggests the English signal reflects pan-Indian pretraining associations rather than genuine Goan cultural knowledge. Our findings highlight a critical gap in low-resource multilingual NLP evaluation for hyperlocal community identities.

Anthology ID: 2026.stereacult-1.10
Volume: Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Weicheng Ma, Soroush Vosoughi, Nabeel Gillani, Rolando Coto-Solano
Venues: StereACuLT | WS
Publisher: Association for Computational Linguistics
Pages: 101–115
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.stereacult-1.10/
Cite (ACL): Michelle Barbosa, Sebastian Padó, and Franziska Weeber. 2026. AmchiBias: Measuring Stereotypical Bias in Goan Identity Groups with a Minimal Pair Dataset in English and Konkani. In Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026), pages 101–115, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AmchiBias: Measuring Stereotypical Bias in Goan Identity Groups with a Minimal Pair Dataset in English and Konkani (Barbosa et al., StereACuLT 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.stereacult-1.10.pdf
