Probing Pre-Trained Language Models for Cross-Cultural Differences in Values

Arnav Arora, Lucie-aimée Kaffee, Isabelle Augenstein


Abstract
Language embeds information about the social, cultural, and political values people hold. Prior work has explored potentially harmful social biases encoded in Pre-trained Language Models (PLMs). However, there has been no systematic study investigating how values embedded in these models vary across cultures. In this paper, we introduce probes to study which cross-cultural values are embedded in these models, and whether they align with existing theories and cross-cultural values surveys. We find that PLMs capture differences in values across cultures, but these align only weakly with established values surveys. We discuss the implications of using misaligned models in cross-cultural settings, as well as ways of aligning PLMs with values surveys.
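As an illustration of what probing a PLM for values might look like in practice, the sketch below queries a multilingual masked language model with a cloze-style statement loosely modelled on a values-survey item and prints its top completions. This is a minimal, hypothetical example: the prompt wording, the choice of bert-base-multilingual-cased, and the use of a fill-mask pipeline are assumptions for illustration, not the probes used in the paper.

    from transformers import pipeline

    # Hypothetical probe: a cloze prompt loosely modelled on a
    # values-survey item. Model and prompt are illustrative
    # assumptions, not the paper's actual probe design.
    fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

    prompt = f"In my country, divorce is {fill.tokenizer.mask_token}."
    for pred in fill(prompt, top_k=5):
        print(f"{pred['token_str']:>12}  score={pred['score']:.3f}")

    # Repeating the prompt translated into other languages and comparing
    # the completions is one way to surface cross-cultural differences
    # encoded in the model.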
Anthology ID: 2023.c3nlp-1.12
Volume: Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Venue: C3NLP
Publisher: Association for Computational Linguistics
Pages: 114–130
URL: https://aclanthology.org/2023.c3nlp-1.12
Cite (ACL): Arnav Arora, Lucie-aimée Kaffee, and Isabelle Augenstein. 2023. Probing Pre-Trained Language Models for Cross-Cultural Differences in Values. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP), pages 114–130, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Probing Pre-Trained Language Models for Cross-Cultural Differences in Values (Arora et al., C3NLP 2023)
PDF: https://preview.aclanthology.org/starsem-semeval-split/2023.c3nlp-1.12.pdf