PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels

Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, Azadeh Shakery


Abstract
Stance detection identifies the viewpoint expressed in text toward a specific target, such as a political figure. While previous datasets have focused primarily on tweet-level stances from established platforms, user-level stance resources—especially on emerging platforms like Bluesky—remain scarce. User-level stance detection provides a more holistic view by considering a user’s complete posting history rather than isolated posts. We present the first stance detection dataset for the 2024 U.S. presidential election, collected from Bluesky and centered on Kamala Harris and Donald Trump. The dataset comprises 16,044 user-target stance pairs enriched with engagement metadata, interaction graphs, and user posting histories. PolitiSky24 was created using a carefully evaluated pipeline combining advanced information retrieval and large language models, which generates stance labels with supporting rationales and text spans for transparency. The labeling approach achieves 81% accuracy with scalable LLMs. This resource addresses gaps in political stance analysis through its timeliness, open-data nature, and user-level perspective. The dataset is available at https://doi.org/10.5281/zenodo.15616911.
Anthology ID:
2025.findings-emnlp.1198
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
21976–21993
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1198/
DOI:
10.18653/v1/2025.findings-emnlp.1198
Cite (ACL):
Peyman Rostami, Vahid Rahimzadeh, Ali Adibi, and Azadeh Shakery. 2025. PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 21976–21993, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
PolitiSky24: U.S. Political Bluesky Dataset with User Stance Labels (Rostami et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1198.pdf
Checklist:
2025.findings-emnlp.1198.checklist.pdf