PrimeX: A Dataset of Worldview, Opinion, and Explanation

Rik Koncel-Kedziorski, Brihi Joshi, Tim Paek


Abstract
As the adoption of language models advances, so does the need to better represent individual users to the model. Are there aspects of an individual’s belief system that a language model can utilize for improved alignment? Following prior research, we investigate this question in the domain of opinion prediction by developing PrimeX, a dataset of public opinion survey data from 858 US residents with two additional sources of belief information: written explanations from the respondents for why they hold specific opinions, and the Primal World Belief survey for assessing respondent worldview. We provide an extensive initial analysis of our data and show the value of belief explanations and worldview for personalizing language models. Our results demonstrate how the additional belief information in PrimeX can benefit both the NLP and psychological research communities, opening up avenues for further study.
Anthology ID:
2025.emnlp-main.1256
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
24747–24772
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1256/
Cite (ACL):
Rik Koncel-Kedziorski, Brihi Joshi, and Tim Paek. 2025. PrimeX: A Dataset of Worldview, Opinion, and Explanation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 24747–24772, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
PrimeX: A Dataset of Worldview, Opinion, and Explanation (Koncel-Kedziorski et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1256.pdf
Checklist:
2025.emnlp-main.1256.checklist.pdf