Abstract
Language embeds information about social, cultural, and political values people hold. Prior work has explored potentially harmful social biases encoded in Pre-trained Language Models (PLMs). However, there has been no systematic study investigating how values embedded in these models vary across cultures. In this paper, we introduce probes to study which cross-cultural values are embedded in these models, and whether they align with existing theories and cross-cultural values surveys. We find that PLMs capture differences in values across cultures, but those only weakly align with established values surveys. We discuss implications of using mis-aligned models in cross-cultural settings, as well as ways of aligning PLMs with values surveys.
- Anthology ID:
- 2023.c3nlp-1.12
- Volume:
- Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Sunipa Dev, Vinodkumar Prabhakaran, David Adelani, Dirk Hovy, Luciana Benotti
- Venue:
- C3NLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 114–130
- URL:
- https://aclanthology.org/2023.c3nlp-1.12
- DOI:
- 10.18653/v1/2023.c3nlp-1.12
- Cite (ACL):
- Arnav Arora, Lucie-aimée Kaffee, and Isabelle Augenstein. 2023. Probing Pre-Trained Language Models for Cross-Cultural Differences in Values. In Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP), pages 114–130, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Probing Pre-Trained Language Models for Cross-Cultural Differences in Values (Arora et al., C3NLP 2023)
- PDF:
- https://preview.aclanthology.org/finnlp-2volume-ingestion/2023.c3nlp-1.12.pdf