SubmissionNumber#=%=#16
FinalPaperTitle#=%=#Enhancing Stress Detection on Social Media Through Multi-Modal Fusion of Text and Synthesized Visuals
ShortPaperTitle#=%=#
NumberOfPages#=%=#10
CopyrightSigned#=%=#Efstathia Soufleri
JobTitle#==#Postdoctoral Researcher
Organization#==#Archimedes Unit, Athena Research Center
6 Artemidos and Epidaurou Str.
Marousi Attiki 15125 Greece
Abstract#==#Social media platforms generate an enormous volume of multi-modal data, yet stress detection research has predominantly relied on text-based analysis. In this work, we propose a novel framework that integrates textual content with synthesized visual cues to enhance stress detection. Using the generative model DALL·E, we synthesize images from social media posts, which are then fused with text through the multi-modal capabilities of a pre-trained CLIP model. Our approach is evaluated on the Dreaddit dataset, where a classifier trained on frozen CLIP features achieves 94.90% accuracy, and full fine-tuning further improves performance to 98.41%. These results underscore that integrating synthesized visuals with textual data not only enhances stress detection but also offers a more robust alternative to traditional text-only methods, paving the way for innovative approaches in mental health monitoring and social media analytics.
Author{1}{Firstname}#=%=#Efstathia
Author{1}{Lastname}#=%=#Soufleri
Author{1}{Username}#=%=#esoufleri
Author{1}{Email}#=%=#e.soufleri@athenarc.gr
Author{1}{Affiliation}#=%=#Athena RC
Author{2}{Firstname}#=%=#Sophia
Author{2}{Lastname}#=%=#Ananiadou
Author{2}{Username}#=%=#effie
Author{2}{Email}#=%=#sophia.nactem@gmail.com
Author{2}{Affiliation}#=%=#University of Manchester
==========
 èéáğö