DFAT: Dual-stage Fusion of Acoustic and Text feature for Speech Emotion Recognition
Nhi Nguyen Yen Truong, Sang Le Quang, Huy Tran Quang, Tri Pham Xuan, Duong Tran Ham, Binh Tran Le Hai, Tin Huynh, Kiem Hoang
- Anthology ID: 2025.vlsp-1.6
- Volume: Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing
- Month: October
- Year: 2025
- Address: Hanoi, Vietnam
- Editors: Luong Chi Mai, Nguyen Thi Minh Huyen, Nguyen Thi Thu Trang
- Venues: VLSP | WS
- Publisher: Association for Computational Linguistics
- Pages: 36–44
- URL: https://preview.aclanthology.org/author-page-lei-gao-usc/2025.vlsp-1.6/
- Cite (ACL): Nhi Nguyen Yen Truong, Sang Le Quang, Huy Tran Quang, Tri Pham Xuan, Duong Tran Ham, Binh Tran Le Hai, Tin Huynh, and Kiem Hoang. 2025. DFAT: Dual-stage Fusion of Acoustic and Text feature for Speech Emotion Recognition. In Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing, pages 36–44, Hanoi, Vietnam. Association for Computational Linguistics.
- Cite (Informal): DFAT: Dual-stage Fusion of Acoustic and Text feature for Speech Emotion Recognition (Truong et al., VLSP 2025)
- PDF: https://preview.aclanthology.org/author-page-lei-gao-usc/2025.vlsp-1.6.pdf