Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition

Rory Beard, Ritwik Das, Raymond W. M. Ng, P. G. Keerthana Gopalakrishnan, Luka Eerens, Pawel Swietojanski, Ondrej Miksik

Abstract
Natural human communication is nuanced and inherently multi-modal. Humans possess specialised sensoria for processing vocal, visual, linguistic, and para-linguistic information, but form an intricately fused percept of the multi-modal data stream to provide a holistic representation. Analysis of emotional content in face-to-face communication is a cognitive task to which humans are particularly attuned, given its sociological importance, and poses a difficult challenge for machine emulation due to the subtlety and expressive variability of cross-modal cues. Inspired by the empirical success of recent so-called End-To-End Memory Networks and related works, we propose an approach based on recursive multi-attention with a shared external memory updated over multiple gated iterations of analysis. We evaluate our model across several large multi-modal datasets and show that global contextualised memory with gated memory update can effectively achieve emotion recognition.
Anthology ID:
K18-1025
Volume:
Proceedings of the 22nd Conference on Computational Natural Language Learning
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Anna Korhonen, Ivan Titov
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
251–259
URL:
https://aclanthology.org/K18-1025
DOI:
10.18653/v1/K18-1025
Cite (ACL):
Rory Beard, Ritwik Das, Raymond W. M. Ng, P. G. Keerthana Gopalakrishnan, Luka Eerens, Pawel Swietojanski, and Ondrej Miksik. 2018. Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition. In Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 251–259, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition (Beard et al., CoNLL 2018)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/K18-1025.pdf
Data
CMU-MOSEI