Does Twitter know your political views? POLiTweets dataset and semi-automatic method for political leaning discovery

Joanna Baran, Michał Kajstura, Maciej Ziolkowski, Krzysztof Rajda


Abstract
Every day, the world is flooded by millions of messages and statements posted on Twitter or Facebook. Social media platforms try to protect users’ personal data, but there is still a real risk of misuse, including election manipulation. Did you know that only 10 posts addressing topics that are important or controversial for society are enough to predict one’s political affiliation with a 0.85 F1-score? To examine this phenomenon, we created a novel universal method of semi-automated political leaning discovery. It relies on a heuristic data annotation procedure, which achieved 0.95 agreement with human annotators (measured as accuracy). We also present POLiTweets, the first publicly open Polish dataset for political affiliation discovery in a multi-party setup, consisting of over 147k tweets from almost 10k Polish-writing users annotated heuristically and almost 40k tweets from 166 users annotated manually as a test set. We used our data to study domain shift with respect to topics and the type of content writers: ordinary citizens vs. professional politicians.
Anthology ID:
2022.politicalnlp-1.8
Volume:
Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Haithem Afli, Mehwish Alam, Houda Bouamor, Cristina Blasi Casagran, Colleen Boland, Sahar Ghannay
Venue:
PoliticalNLP
Publisher:
European Language Resources Association
Pages:
56–61
URL:
https://aclanthology.org/2022.politicalnlp-1.8
Cite (ACL):
Joanna Baran, Michał Kajstura, Maciej Ziolkowski, and Krzysztof Rajda. 2022. Does Twitter know your political views? POLiTweets dataset and semi-automatic method for political leaning discovery. In Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences, pages 56–61, Marseille, France. European Language Resources Association.
Cite (Informal):
Does Twitter know your political views? POLiTweets dataset and semi-automatic method for political leaning discovery (Baran et al., PoliticalNLP 2022)
PDF:
https://preview.aclanthology.org/autopr/2022.politicalnlp-1.8.pdf