2025
Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms
Rajvardhan Oak | Muhammad Haroon | Claire Wonjeong Jo | Magdalena Wojcieszak | Anshuman Chhabra
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Social media platforms utilize Machine Learning (ML) and Artificial Intelligence (AI) powered recommendation algorithms to maximize user engagement, which can result in inadvertent exposure to harmful content. Current moderation efforts, reliant on classifiers trained with extensive human-annotated data, struggle with scalability and adaptation to new forms of harm. To address these challenges, we propose a novel re-ranking approach using Large Language Models (LLMs) in zero-shot and few-shot settings. Our method dynamically assesses and re-ranks content sequences, effectively mitigating harmful content exposure without requiring extensive labeled data. Alongside traditional ranking metrics, we introduce two new metrics to evaluate the effectiveness of re-ranking in reducing exposure to harmful content. Through experiments across three datasets, three models, and three configurations, we demonstrate that our LLM-based approach significantly outperforms existing proprietary moderation approaches, offering a scalable and adaptable solution for harm mitigation.
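A minimal sketch of how such zero-shot LLM re-ranking of a recommended feed could look in practice; the prompt wording, the 1-10 harm scale, the model name, and the OpenAI-style client are illustrative assumptions, not the paper's exact protocol.

```python
# Sketch: re-rank a content feed by an LLM-assigned harm score (zero-shot).
# Prompt, scale, and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def harm_score(text: str, model: str = "gpt-4o-mini") -> float:
    """Ask the LLM to rate how harmful a post is on a 1-10 scale."""
    prompt = (
        "Rate the following social media post for harmful content "
        "(hate, harassment, self-harm, misinformation) on a scale of 1 "
        "(harmless) to 10 (extremely harmful). Reply with a single number.\n\n"
        f"Post: {text}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    try:
        return float(response.choices[0].message.content.strip())
    except ValueError:
        return 5.0  # fall back to a neutral score if the reply is not numeric

def rerank_feed(posts: list[str]) -> list[str]:
    """Re-order a recommended sequence so less harmful posts surface first."""
    return sorted(posts, key=harm_score)
```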
“Whose Side Are You On?” Estimating Ideology of Political and News Content Using Large Language Models and Few-shot Demonstration Selection
Muhammad Haroon | Magdalena Wojcieszak | Anshuman Chhabra
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
The rapid growth of social media platforms has led to concerns about radicalization, filter bubbles, and content bias. Existing approaches to classifying ideology are limited: they require extensive human effort and the labeling of large datasets, and they cannot adapt to evolving ideological contexts. This paper explores the potential of Large Language Models (LLMs) for classifying the political ideology of online content through in-context learning (ICL). Our extensive experiments with label-balanced demonstration selection, conducted on three datasets comprising news articles and YouTube videos, reveal that our approach significantly outperforms zero-shot and traditional supervised methods. Additionally, we evaluate the influence of metadata (e.g., content source and descriptions) on ideological classification and discuss its implications. Finally, we show how providing the source for political and non-political content influences the LLM’s classification.
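A minimal sketch of label-balanced demonstration selection for in-context ideology classification; the label set, prompt format, number of shots per label, and OpenAI-style client are illustrative assumptions rather than the paper's exact setup.

```python
# Sketch: few-shot (ICL) ideology classification with label-balanced demos.
# Labels, prompt wording, and shot count are illustrative assumptions.
import random
from openai import OpenAI

client = OpenAI()
LABELS = ["left", "center", "right"]

def select_demonstrations(pool: list[dict], k_per_label: int = 2) -> list[dict]:
    """Pick an equal number of labeled examples per ideology label."""
    demos = []
    for label in LABELS:
        candidates = [ex for ex in pool if ex["label"] == label]
        demos.extend(random.sample(candidates, k_per_label))
    return demos

def classify_ideology(text: str, pool: list[dict]) -> str:
    """Build a label-balanced few-shot prompt and query the LLM."""
    demos = select_demonstrations(pool)
    shots = "\n\n".join(f"Text: {d['text']}\nIdeology: {d['label']}" for d in demos)
    prompt = (
        "Classify the political ideology of the text as left, center, or right.\n\n"
        f"{shots}\n\nText: {text}\nIdeology:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()
```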
2020
Developing a New Classifier for Automated Identification of Incivility in Social Media
Sam Davidson | Qiusi Sun | Magdalena Wojcieszak
Proceedings of the Fourth Workshop on Online Abuse and Harms
Incivility is not only prevalent on online social media platforms, but also has concrete effects on individual users, online groups, and the platforms themselves. Given the prevalence and effects of online incivility, and the challenges involved in human-based incivility detection, it is urgent to develop validated and versatile automatic approaches to identifying uncivil posts and comments. This project advances both a neural, BERT-based classifier and a logistic regression classifier to identify uncivil comments. The classifiers are trained on a dataset of Reddit posts annotated for incivility, which is further expanded with labeled data from Reddit and Twitter. Our best-performing model achieves an F1 of 0.802 on our Reddit test set. The final model is not only applicable across social media platforms and their distinct data structures, but also computationally versatile, and, as such, ready to be used on vast volumes of online data. All trained models and annotated data are made available to the research community.
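A minimal sketch of fine-tuning a BERT classifier for incivility detection with Hugging Face transformers; the toy examples, hyperparameters, and column names are illustrative assumptions, not the authors' released models or configuration.

```python
# Sketch: fine-tune bert-base-uncased as a binary incivility classifier.
# Toy data and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = civil, 1 = uncivil
)

# Toy stand-in for annotated Reddit/Twitter comments.
train_data = Dataset.from_dict({
    "text": ["Thanks for sharing this!", "You are an idiot."],
    "label": [0, 1],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="incivility-bert",
        num_train_epochs=3,
        per_device_train_batch_size=16,
    ),
    train_dataset=train_data,
)
trainer.train()
```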