Abdullah Al Monsur
2025
BanStereoSet: A Dataset to Measure Stereotypical Social Biases in LLMs for Bangla
Mahammed Kamruzzaman | Abdullah Al Monsur | Shrabon Kumar Das | Enamul Hassan | Gene Louis Kim
Findings of the Association for Computational Linguistics: ACL 2025
This study presents ***BanStereoSet***, a dataset designed to evaluate stereotypical social biases in multilingual LLMs for the Bangla language. In an effort to extend the focus of bias research beyond English-centric datasets, we have localized content from StereoSet, IndiBias, and Kamruzzaman et al.’s datasets, producing a resource tailored to capture biases prevalent within the Bangla-speaking community. Our BanStereoSet dataset consists of 1,194 sentences spanning 9 categories of bias: race, profession, gender, ageism, beauty, beauty in profession, region, caste, and religion. This dataset not only serves as a crucial tool for measuring bias in multilingual LLMs but also facilitates the exploration of stereotypical bias across different social categories, potentially guiding the development of more equitable language technologies in *Bangladeshi* contexts. Our analysis of several language models using this dataset indicates significant biases, reinforcing the necessity for culturally and linguistically adapted datasets to develop more equitable language technologies.
From Anger to Joy: How Nationality Personas Shape Emotion Attribution in Large Language Models
Mahammed Kamruzzaman | Abdullah Al Monsur | Gene Louis Kim | Anshuman Chhabra
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Emotions are a fundamental facet of human experience, varying across individuals, cultural contexts, and nationalities. Given the recent success of Large Language Models (LLMs) as role-playing agents, we examine whether LLMs exhibit emotional stereotypes when assigned nationality-specific personas. Specifically, we investigate how different countries are represented in pre-trained LLMs through emotion attributions and whether these attributions align with cultural norms. To provide a deeper interpretive lens, we incorporate four key cultural dimensions, namely Power Distance, Uncertainty Avoidance, Long-Term Orientation, and Individualism, derived from Hofstede’s cross-cultural framework. Our analysis reveals significant nationality-based differences, with emotions such as shame, fear, and joy being disproportionately assigned across regions. Furthermore, we observe notable misalignment between LLM-generated and human emotional responses, particularly for negative emotions, highlighting the presence of reductive and potentially biased stereotypes in LLM outputs.