Zehui Wu


2025

Beyond Silent Letters: Amplifying LLMs in Emotion Recognition with Vocal Nuances
Zehui Wu | Ziwei Gong | Lin Ai | Pengyuan Shi | Kaan Donbekci | Julia Hirschberg
Findings of the Association for Computational Linguistics: NAACL 2025

Akan Cinematic Emotions (ACE): A Multimodal Multi-party Dataset for Emotion Recognition in Movie Dialogues
David Sasu | Zehui Wu | Ziwei Gong | Run Chen | Pengyuan Shi | Lin Ai | Julia Hirschberg | Natalie Schluter
Findings of the Association for Computational Linguistics: ACL 2025

In this paper, we introduce the Akan Cinematic Emotions (AkaCE) dataset, the first multimodal emotion dialogue dataset for an African language, addressing the significant lack of resources for low-resource languages in emotion recognition research. AkaCE, developed for the Akan language, contains 385 emotion-labeled dialogues and 6162 utterances across audio, visual, and textual modalities, along with word-level prosodic prominence annotations. The presence of prosodic labels in this dataset also makes it the first prosodically annotated African language dataset. We demonstrate the quality and utility of AkaCE through experiments using state-of-the-art emotion recognition methods, establishing solid baselines for future research. We hope AkaCE inspires further work on inclusive, linguistically and culturally diverse NLP resources.

2024

Multimodal Multi-loss Fusion Network for Sentiment Analysis
Zehui Wu | Ziwei Gong | Jaywon Koo | Julia Hirschberg
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

This paper investigates the optimal selection and fusion of feature encoders across multiple modalities and combines these in one neural network to improve sentiment detection. We compare different fusion methods and examine the impact of multi-loss training within the multi-modality fusion network, identifying surprisingly important findings relating to subnet performance. We also find that integrating context significantly enhances model performance. Our best model achieves state-of-the-art performance on three datasets (CMU-MOSI, CMU-MOSEI, and CH-SIMS). These results suggest a roadmap toward an optimized feature selection and fusion approach for enhancing sentiment detection in neural networks.