Genan Dai


2026

Social media platforms have become critical arenas for public discourse, yet existing stance detection methods often reduce opinions to surface-level labels, overlooking the conversational evidence behind stance expressions. We introduce Conversational Stance-Cause Pair Detection (CSCPD), a new task that jointly identifies the stance polarity and its observable contextual evidence within multi-turn conversations. To advance research in this direction, we present Cause-CSD, the first large-scale dataset for CSCPD, spanning 21,048 annotated stance-cause pairs across diverse open-domain, textual, and multimodal discussions. We further propose the Stance-Cause Detection Language Model (SCD-LM), a unified language model framework that leverages explicit context reasoning and joint decoding to predict stances and their supporting causes, along with human-readable rationales. Extensive experiments demonstrate that SCD-LM achieves state-of-the-art results on both text-only and multimodal subtasks, significantly outperforming strong baselines, especially for long-range and image-grounded cause detection. Our work advances explainable stance analysis and supports the understanding of what drives public opinion in high-impact online settings.
Political user-level stance detection is vital for analyzing polarization, yet progress is hindered by the scarcity of high-quality benchmarks integrating linguistic and social signals. Existing datasets, which largely rely on noisy heuristics or distant supervision, limit model robustness and generalizability. To address this, we introduce TwiUSD, a large-scale, expert-annotated benchmark for political user-level stance detection with explicit social network structure. TwiUSD comprises 16,211 users and 47,757 tweets, labeled by domain experts using a protocol that integrates both user content and followee signals, ensuring high-quality annotations (kappa > 0.9). Building upon TwiUSD, we propose MRFG, a Multi-scale Relevance Filtering and Graph-aware framework that leverages large language models to filter stance-relevant followee content and adaptively routes features based on structural informativeness. This design enables robust stance prediction by jointly modeling semantic and relational cues. Extensive experiments show that MRFG significantly outperforms strong baselines, highlighting the importance of relevance filtering and structure-aware modeling.
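The TwiUSD abstract reports annotation agreement as kappa > 0.9 without saying which kappa statistic was used. As a minimal sketch, assuming two annotators and Cohen's kappa (an assumption, not something the abstract confirms), the statistic can be computed as below. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: exactly two annotators with categorical labels. The source
    abstract does not specify which kappa variant was actually used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates,
    # summed over the label categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels, for illustration only.
annotator_1 = ["left", "right", "left", "neutral", "right", "left"]
annotator_2 = ["left", "right", "left", "neutral", "left", "left"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a value above 0.9 indicates that annotators agreed far more often than their label frequencies alone would predict.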

2025

Topic evolution and stance dynamics are deeply intertwined in online social media, shaping the fragmentation and polarization of public discourse. Yet existing dynamic topic models and stance analysis approaches usually consider these processes in isolation, relying on abstractions that lack interpretability and agent-level behavioral fidelity. We present the stance and topic evolution reasoning framework (SPARK), the first LLM-based multi-agent simulation framework for jointly modeling the co-evolution of topics and stances through natural language interactions. In SPARK, each agent is instantiated as an LLM persona with unique demographic and psychological traits, equipped with memory and reflective reasoning. Agents engage in daily conversations, adapt their stances, and organically introduce emergent subtopics, enabling interpretable, fine-grained simulation of discourse dynamics at scale. Experiments across five real-world domains show that SPARK captures key empirical patterns, such as rapid topic innovation in technology, domain-specific stance polarization, and the influence of personality on stance shifts and topic emergence. Our framework quantitatively reveals the bidirectional mechanisms by which stance shifts and topic evolution reinforce each other, a phenomenon rarely addressed in prior work. SPARK provides actionable insights and a scalable tool for understanding and mitigating polarization in online discourse. Code and simulation resources will be released after acceptance.