Kevin El Haddad

Also published as: Kevin El Haddad


2022

Proceedings of the Workshop on Smiling and Laughter across Contexts and the Life-span within the 13th Language Resources and Evaluation Conference
Chiara Mazzocconi | Kevin El Haddad | Catherine Pelachaud | Gary McKeown

2018

ASR-based Features for Emotion Recognition: A Transfer Learning Approach
Noé Tits | Kevin El Haddad | Thierry Dutoit
Proceedings of Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML)

Over the last decade, applications of signal processing have improved drastically with deep learning. However, areas of affective computing such as emotional speech synthesis or emotion recognition from spoken language remain challenging. In this paper, we investigate the use of a neural Automatic Speech Recognition (ASR) system as a feature extractor for emotion recognition. We show that these features outperform the eGeMAPS feature set in predicting the valence and arousal emotional dimensions, which means that the audio-to-text mapping learned by the ASR system contains information related to the emotional dimensions in spontaneous speech. We also examine how the first layers (closer to speech) and last layers (closer to text) of the ASR system relate to valence and arousal.

2016

AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis
Kevin El Haddad | Hüseyin Çakmak | Stéphane Dupont | Thierry Dutoit
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

It has been shown that adding expressivity and emotional expressions to an agent’s communication systems improves the quality of interaction between the agent and a human user. In this paper, we present a multimodal database of affect bursts, which are very short non-verbal expressions with facial, vocal, and gestural components that are highly synchronized and triggered by an identifiable event. The database contains motion capture and audio data of affect bursts representing disgust, startle, and surprise, each recorded at three different levels of arousal. It is intended for synthesis, with the goal of generating affect bursts of these emotions on a continuous arousal scale.