Shadi Rezapour


2024

pdf
Identifying Narrative Content in Podcast Transcripts
Yosra Abdessamed | Shadi Rezapour | Steven Wilson
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

As one of the oldest forms of human communication, narratives appear across a variety of genres and media. Computational methods have been applied to study narrativity in novels, social media, and patient records, leading to new approaches and insights. However, other types of media are growing in popularity, like podcasts. Podcasts contain a multitude of spoken narratives that can provide a meaningful glimpse into how people share stories with one another.In this paper, we outline and apply methods to process English-language podcast transcripts and extract narrative content from conversations within each episode. We provide an initial analysis of the types of narrative content that exists within a wide range of podcasts, and compare our results to other established narrative analysis tools.Our annotations for narrativity and pretrained models can help to enable future research into narrativity within a large corpus of approximately 100,000 podcast episodes.

pdf
Decoding the Narratives: Analyzing Personal Drug Experiences Shared on Reddit
Layla Bouzoubaa | Elham Aghakhani | Max Song | Quang Trinh | Shadi Rezapour
Findings of the Association for Computational Linguistics: ACL 2024

Online communities such as drug-related subreddits serve as safe spaces for people who use drugs (PWUD), fostering discussions on substance use experiences, harm reduction, and addiction recovery. Users’ shared narratives on these forums provide insights into the likelihood of developing a substance use disorder (SUD) and recovery potential. Our study aims to develop a multi-level, multi-label classification model to analyze online user-generated texts about substance use experiences. For this purpose, we first introduce a novel taxonomy to assess the nature of posts, including their intended connections (Inquisition or Disclosure), subjects (e.g., Recovery, Dependency), and specific objectives (e.g., Relapse, Quality, Safety). Using various multi-label classification algorithms on a set of annotated data, we show that GPT-4, when prompted with instructions, definitions, and examples, outperformed all other models. We apply this model to label an additional 1,000 posts and analyze the categories of linguistic expression used within posts in each class. Our analysis shows that topics such as Safety, Combination of Substances, and Mental Health see more disclosure, while discussions about physiological Effects focus on harm reduction. Our work enriches the understanding of PWUD’s experiences and informs the broader knowledge base on SUD and drug use.