SubmissionNumber#=%=#265
FinalPaperTitle#=%=#DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning
ShortPaperTitle#=%=#
NumberOfPages#=%=#11
CopyrightSigned#=%=#Julia
JobTitle#==#
Organization#==#MIPT
Abstract#==#This paper presents the solution of the DeepPavlov team for the Multimodal Emotion Cause Analysis competition in SemEval-2024 Task 3, Subtask 2 (Wang et al., 2024). On the evaluation leaderboard, our approach ranks 7th with an F1-score of 0.2132. Large Language Models (LLMs) are transformative in their ability to comprehend and generate human-like text. With recent advancements, Multimodal Large Language Models (MLLMs) have expanded LLM capabilities, integrating modalities such as audio, vision, and language. Our work delves into the state-of-the-art MLLM Video-LLaMA, its associated modalities, and its application to the emotion reasoning downstream task, Multimodal Emotion Cause Analysis in Conversations (MECAC). We investigate the model's performance in several modes: zero-shot, few-shot, individual embeddings, and fine-tuned, providing insights into their limits and potential enhancements for emotion understanding.
Author{1}{Firstname}#=%=#Julia
Author{1}{Lastname}#=%=#Belikova
Author{1}{Username}#=%=#julia-bel
Author{1}{Email}#=%=#belikova.iua@phystech.edu
Author{1}{Affiliation}#=%=#MIPT
Author{2}{Firstname}#=%=#Dmitrii
Author{2}{Lastname}#=%=#Kosenko
Author{2}{Username}#=%=#dimweb
Author{2}{Email}#=%=#dimweb.tech@mail.ru
Author{2}{Affiliation}#=%=#MIPT