Runyang You


2025

This paper describes the system we submitted to SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection. To address the highly subjective nature of emotion detection, we propose a model-ensemble strategy designed to capture the varying subjective perceptions that different users have of the same text. The ensemble uses several large language models as base models and combines their predictions with methods such as neural networks, decision trees, linear regression, and weighted voting. In Track A, our system achieved first place in 19 of 28 languages; in Track B, it ranked first in 10 of 11 languages. Our system also attained the highest average performance across all languages in both tracks.
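The abstract does not specify how the combination methods are implemented. As a minimal sketch of the weighted-voting variant, assuming each base LLM emits a probability distribution over the emotion labels and that per-model weights are tuned on held-out data, the fusion step could look like the following (all names and numbers here are hypothetical, not taken from the paper):

```python
import numpy as np

def weighted_vote(prob_matrix: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Combine base-model emotion probabilities by weighted voting.

    prob_matrix: shape (n_models, n_labels); each row is one base model's
                 predicted probability distribution over the emotion labels.
    weights:     shape (n_models,); per-model weights (e.g. tuned on a
                 validation split), normalized here before use.
    Returns the fused probability distribution over the labels.
    """
    w = weights / weights.sum()
    return w @ prob_matrix  # weighted average of the model distributions

# Hypothetical example: three base LLMs, four emotion labels.
probs = np.array([
    [0.6, 0.2, 0.1, 0.1],   # model 1
    [0.5, 0.3, 0.1, 0.1],   # model 2
    [0.2, 0.5, 0.2, 0.1],   # model 3
])
weights = np.array([0.5, 0.3, 0.2])  # assumed validation-tuned weights
fused = weighted_vote(probs, weights)
print(fused, fused.argmax())  # fused distribution and predicted label index
```

The other combiners mentioned in the abstract (neural networks, decision trees, linear regression) would replace this weighted average with a learned model trained on the stacked base-model outputs.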
Understanding idioms in multimodal contexts poses significant challenges due to data scarcity, idiomatic ambiguity, and the need to align visual and textual inputs effectively. In this work, we introduce MIRA (Multimodal Idiom Recognition and Alignment), a training-free framework designed to address these challenges on the SemEval-2025 Task 1 (AdMIRe) benchmark. MIRA leverages powerful closed-source large language models (LLMs) and integrates three key innovations: bias correction via in-context learning, multi-step semantic-visual fusion, and a self-revision mechanism that iteratively refines its outputs through backward verification. By systematically processing and fusing multimodal inputs, MIRA produces high-quality, fine-grained image-text representations that improve idiom comprehension across languages and cultural contexts. Evaluations in both English and Portuguese show that our approach achieves robust performance without additional training, setting a new standard for multimodal idiom recognition.
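The abstract describes the self-revision mechanism only at a high level. The sketch below shows one plausible structure for such a loop, assuming hypothetical generate_answer and verify_answer wrappers around the closed-source LLM; it is an illustration of the general technique, not the paper's actual implementation.

```python
from typing import Callable

def self_revise(
    prompt: str,
    generate_answer: Callable[[str], str],
    verify_answer: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> str:
    """Iteratively refine an LLM output via backward verification.

    generate_answer: hypothetical wrapper around the closed-source LLM
                     that produces a candidate answer for the prompt.
    verify_answer:   hypothetical backward check that asks the model to
                     verify the candidate against the original inputs.
    Stops as soon as a candidate passes verification, or after
    max_rounds attempts, returning the last candidate either way.
    """
    candidate = generate_answer(prompt)
    for _ in range(max_rounds):
        if verify_answer(prompt, candidate):
            break  # backward verification accepted the answer
        # Feed the failed attempt back to the model and ask for a revision.
        candidate = generate_answer(
            f"{prompt}\n\nPrevious answer:\n{candidate}\n"
            "The previous answer failed verification; please revise it."
        )
    return candidate
```

Capping the number of rounds keeps the cost of repeated closed-source API calls bounded while still allowing several verification-driven revisions per example.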