Tunazzina Islam

2025

pdf bib abs
Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy
Tunazzina Islam | Dan Goldwasser
Findings of the Association for Computational Linguistics: NAACL 2025

The widespread use of social media has led to a surge in popularity for automated methods of analyzing public opinion. Supervised methods are adept at text categorization, yet the dynamic nature of social media discussions poses a continual challenge for these techniques due to the constant shifting of the focus. On the other hand, traditional unsupervised methods for extracting themes from public discourse, such as topic modeling, often reveal overarching patterns that might not capture specific nuances. Consequently, a significant portion of research into social media discourse still depends on labor-intensive manual coding techniques and a human-in-the-loop approach, which are both time-consuming and costly. In this work, we study the problem of discovering arguments associated with a specific theme. We propose a generic **LLMs-in-the-Loop** strategy that leverages the advanced capabilities of Large Language Models (LLMs) to extract latent arguments from social media messaging. To demonstrate our approach, we apply our framework to contentious topics. We use two publicly available datasets: (1) the climate campaigns dataset of 14k Facebook ads with 25 themes and (2) the COVID-19 vaccine campaigns dataset of 9k Facebook ads with 14 themes. Additionally, we design a downstream task as stance prediction by leveraging talking points in climate debates. Furthermore, we analyze demographic targeting and the adaptation of messaging based on real-world events.

2023

pdf bib abs
Interactive Concept Learning for Uncovering Latent Themes in Large Text Collections
Maria Leonor Pacheco | Tunazzina Islam | Lyle Ungar | Ming Yin | Dan Goldwasser
Findings of the Association for Computational Linguistics: ACL 2023

Experts across diverse disciplines are often interested in making sense of large text collections. Traditionally, this challenge is approached either by noisy unsupervised techniques such as topic models, or by following a manual theme discovery process. In this paper, we expand the definition of a theme to account for more than just a word distribution, and include generalized concepts deemed relevant by domain experts. Then, we propose an interactive framework that receives and encodes expert feedback at different levels of abstraction. Our framework strikes a balance between automation and manual coding, allowing experts to maintain control of their study while reducing the manual effort required.

2022

pdf bib abs
Interactively Uncovering Latent Arguments in Social Media Platforms: A Case Study on the Covid-19 Vaccine Debate
Maria Leonor Pacheco | Tunazzina Islam | Lyle Ungar | Ming Yin | Dan Goldwasser
Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)

Automated methods for analyzing public opinion have grown in popularity with the proliferation of social media. While supervised methods can be very good at classifying text, the dynamic nature of social media discourse results in a moving target for supervised learning. Meanwhile, traditional unsupervised techniques for extracting themes from textual repositories, such as topic models, can result in incorrect outputs that are unusable to domain experts. For this reason, a non-trivial amount of research on social media discourse still relies on manual coding techniques. In this paper, we present an interactive, humans-in-the-loop framework that strikes a balance between unsupervised techniques and manual coding for extracting latent arguments from social media discussions. We use the COVID-19 vaccination debate as a case study, and show that our methodology can be used to obtain a more accurate, interpretable set of arguments when compared to traditional topic models. We do this at a relatively low manual cost, as 3 experts take approximately 2 hours to code close to 100k tweets.

pdf bib abs
A Holistic Framework for Analyzing the COVID-19 Vaccine Debate
Maria Leonor Pacheco | Tunazzina Islam | Monal Mahajan | Andrey Shor | Ming Yin | Lyle Ungar | Dan Goldwasser
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The Covid-19 pandemic has led to infodemic of low quality information leading to poor health decisions. Combating the outcomes of this infodemic is not only a question of identifying false claims, but also reasoning about the decisions individuals make. In this work we propose a holistic analysis framework connecting stance and reason analysis, and fine-grained entity level moral sentiment analysis. We study how to model the dependencies between the different level of analysis and incorporate human insights into the learning process. Experiments show that our framework provides reliable predictions even in the low-supervision settings.

Co-authors

Andrey Shor 1

Venues

Fix data