EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations

Haoqin Sun, Jinghua Zhao, Xuechen Wang, Shiwan Zhao, Jiaming Zhou, Hui Wang, Xi Yang, Yequan Wang, Yonghua Lin


Abstract
The advancement of Multimodal Emotion Recognition (MER) in Chinese is significantly hindered by the scarcity of high-quality, spontaneous dialogue datasets compared to their English counterparts. In this work, we introduce EmotionTalk, the first interactive Chinese multimodal dataset designed to capture the nuance of authentic emotional interplay. Collected from 19 professional actors, the dataset spans 23.6 hours of dyadic conversations across diverse scenarios. A key contribution of EmotionTalk is its multi-grained annotation system, which integrates standard categorical and dimensional labels with fine-grained emotional speaking style captions, enabling research into interpretable emotion analysis. We establish comprehensive benchmarks for emotion recognition and captioning tasks, verifying the dataset’s effectiveness and the necessity of multimodal fusion. EmotionTalk serves as a critical resource for bridging the gap in non-English affective computing and is publicly released for the research community.
Anthology ID:
2026.findings-acl.440
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9054–9071
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.440/
Cite (ACL):
Haoqin Sun, Jinghua Zhao, Xuechen Wang, Shiwan Zhao, Jiaming Zhou, Hui Wang, Xi Yang, Yequan Wang, and Yonghua Lin. 2026. EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9054–9071, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations (Sun et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.440.pdf
Checklist:
 2026.findings-acl.440.checklist.pdf