Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition
Rory Beard, Ritwik Das, Raymond W. M. Ng, P. G. Keerthana Gopalakrishnan, Luka Eerens, Pawel Swietojanski, Ondrej Miksik
Abstract
Natural human communication is nuanced and inherently multi-modal. Humans possess specialised sensoria for processing vocal, visual, linguistic, and para-linguistic information, but form an intricately fused percept of the multi-modal data stream to provide a holistic representation. Analysis of emotional content in face-to-face communication is a cognitive task to which humans are particularly attuned, given its sociological importance, and poses a difficult challenge for machine emulation due to the subtlety and expressive variability of cross-modal cues. Inspired by the empirical success of recent so-called End-To-End Memory Networks and related works, we propose an approach based on recursive multi-attention with a shared external memory updated over multiple gated iterations of analysis. We evaluate our model across several large multi-modal datasets and show that global contextualised memory with gated memory update can effectively achieve emotion recognition.
- Anthology ID:
- K18-1025
- Volume:
- Proceedings of the 22nd Conference on Computational Natural Language Learning
- Month:
- October
- Year:
- 2018
- Address:
- Brussels, Belgium
- Venue:
- CoNLL
- SIG:
- SIGNLL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 251–259
- URL:
- https://aclanthology.org/K18-1025
- DOI:
- 10.18653/v1/K18-1025
- Cite (ACL):
- Rory Beard, Ritwik Das, Raymond W. M. Ng, P. G. Keerthana Gopalakrishnan, Luka Eerens, Pawel Swietojanski, and Ondrej Miksik. 2018. Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 251–259, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition (Beard et al., CoNLL 2018)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/K18-1025.pdf
- Data
- CMU-MOSEI