Multimodal Invariant Sentiment Representation Learning
Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An
Abstract
Multimodal Sentiment Analysis (MSA) integrates diverse modalities to overcome the limitations of unimodal data. However, existing MSA datasets commonly exhibit significant sentiment distribution imbalances and cross-modal sentiment conflicts, which hinder performance improvement. This paper shows that distributional discrepancies and sentiment conflicts can be incorporated into model training to learn stable multimodal invariant sentiment representations. To this end, we propose a Multimodal Invariant Sentiment Representation Learning (MISR) method. Specifically, we first learn a stable and consistent multimodal joint representation in a Gaussian latent space based on distributional constraints. Then, under an invariance constraint, we further learn multimodal invariant sentiment representations from multiple distributional environments constructed from the joint representation and unimodal data, achieving robust and efficient MSA performance. Extensive experiments demonstrate that MISR significantly enhances MSA performance and achieves new state-of-the-art results.
- Anthology ID:
- 2025.findings-acl.761
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2025
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 14743–14755
- URL:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.761/
- Cite (ACL):
- Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, and Ning An. 2025. Multimodal Invariant Sentiment Representation Learning. In Findings of the Association for Computational Linguistics: ACL 2025, pages 14743–14755, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Multimodal Invariant Sentiment Representation Learning (Zhu et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.findings-acl.761.pdf