Abstract
Language embeds information about the social, cultural, and political values people hold. Prior work has explored potentially harmful social biases encoded in Pre-trained Language Models (PLMs). However, there has been no systematic study investigating how values embedded in these models vary across cultures. In this paper, we introduce probes to study which cross-cultural values are embedded in these models, and whether they align with existing theories and cross-cultural values surveys. We find that PLMs capture differences in values across cultures, but these only weakly align with established values surveys. We discuss the implications of using misaligned models in cross-cultural settings, as well as ways of aligning PLMs with values surveys.
- Anthology ID:
- 2023.c3nlp-1.12
- Volume:
- Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Venue:
- C3NLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 114–130
- URL:
- https://aclanthology.org/2023.c3nlp-1.12
- Cite (ACL):
- Arnav Arora, Lucie-Aimée Kaffee, and Isabelle Augenstein. 2023. Probing Pre-Trained Language Models for Cross-Cultural Differences in Values. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP), pages 114–130, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Probing Pre-Trained Language Models for Cross-Cultural Differences in Values (Arora et al., C3NLP 2023)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2023.c3nlp-1.12.pdf