2024
pdf
abs
Linguistic Nudges and Verbal Interaction with Robots, Smart-Speakers, and Humans
Natalia Kalashnikova
|
Ioana Vasilescu
|
Laurence Devillers
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper describes a data collection methodology and emotion annotation of dyadic interactions between a human and either a Pepper robot, a Google Home smart speaker, or another human. The 16 hours of collected audio recordings were used to analyze the propensity to change one’s opinion about ecological behavior depending on the type of conversational agent, the kind of nudge, and the speaker’s emotional state. We describe the statistics of data collection and annotation. We also report first results, which show that humans change their opinions on more questions with a human interlocutor than with a device, even against mainstream ideas. We observe a correlation between the speaker’s emotional state, the interlocutor, and the propensity to be influenced. We also report the results of studies that used our data to investigate the effect of human likeness on speech.
2023
pdf
abs
Effet de l’anthropomorphisme des machines sur le français adressé aux robots: Étude du débit de parole et de la fluence [Effect of machine anthropomorphism on French addressed to robots: A study of speech rate and fluency]
Natalia Kalashnikova
|
Mathilde Hutin
|
Ioana Vasilescu
|
Laurence Devillers
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 4 : articles déjà soumis ou acceptés en conférence internationale
“Robot-directed speech” refers to speech addressed to a robotic device, from small home smart speakers to life-size humanoid robots. Previous studies have analyzed the phonetic and linguistic properties of this type of speech, as well as the effect of device anthropomorphism on the sociability of interactions, but the effect of anthropomorphism on linguistic realizations had never been explored. Our study proposes to fill this gap with the analysis of one phonetic parameter (speech rate) and one linguistic parameter (frequency of filled pauses) in speech addressed to a smart speaker vs. a humanoid robot vs. a human. Data from 71 native French speakers indicate that utterances addressed to humans are longer, faster, and more disfluent than those addressed to the smart speaker or the robot. Speech addressed to the smart speaker and to the robot differs significantly from speech addressed to the human, but not from each other, indicating the existence of a distinct type of machine-directed speech.
2022
pdf
abs
Corpus Design for Studying Linguistic Nudges in Human-Computer Spoken Interactions
Natalia Kalashnikova
|
Serge Pajak
|
Fabrice Le Guel
|
Ioana Vasilescu
|
Gemma Serrano
|
Laurence Devillers
Proceedings of the Thirteenth Language Resources and Evaluation Conference
In this paper, we present the corpus design methodology that will be used to compare the influence of linguistic nudges of positive or negative polarity across three conversational agents: a robot, a smart speaker, and a human. We recruited forty-nine participants to form six groups. The conversational agents first asked the participants about their willingness to adopt five ecological habits and to invest time and money in ecological problems. The participants were then asked the same questions, each preceded by one linguistic nudge with positive or negative influence. A comparison of the mean and standard deviation of the differences between these two ratings (before and after the nudge) showed that participants were mainly affected by nudges with positive influence, even though several nudges with negative influence decreased the average rating. In addition, participants from all groups were willing to spend more money than time on ecological problems. In general, our experiment’s early results suggest that a machine agent can influence participants to the same degree as a human agent. A better understanding of the power of influence of different conversational machines, and of the potential influence of nudges of different polarities, will lead to the development of ethical norms for human-computer interactions.
2014
pdf
abs
Smile and Laughter in Human-Machine Interaction: a study of engagement
Mariette Soury
|
Laurence Devillers
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This article presents a corpus featuring adults playing games in interaction with machines trying to induce laughter. This corpus was collected during Interspeech 2013 in Lyon to study behavioral differences correlated with different personalities and cultures. We first present the collection protocol, then the corpus obtained, and finally various quantitative and qualitative measures. Smiles and laughs are types of affect bursts, which are defined as short emotional non-speech expressions. Here we correlate smiles and laughs with personality traits and cultural background. Our final objective is to propose a measure of engagement deduced from these affect bursts.
pdf
Détection des états affectifs lors d’interactions parlées : robustesse des indices non verbaux [Automatic in-voice affective state detection in spontaneous speech: robustness of non-verbal cues]
Laurence Devillers
|
Marie Tahon
|
Mohamed A. Sehili
|
Agnès Delaborde
Traitement Automatique des Langues, Volume 55, Numéro 2 : Traitement automatique du langage parlé [Spoken language processing]
2012
pdf
Détection d’émotions dans la voix de patients en interaction avec un agent conversationnel animé (Emotions detection in the voice of patients interacting with an animated conversational agent) [in French]
Clément Chastagnol
|
Laurence Devillers
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP
pdf
Impact du Comportement Social d’un Robot sur les Emotions de l’Utilisateur : une Expérience Perceptive (Impact of the Social Behaviour of a Robot on the User’s Emotions: a Perceptive Experiment) [in French]
Agnes Delaborde
|
Laurence Devillers
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP
pdf
abs
Corpus of Children Voices for Mid-level Markers and Affect Bursts Analysis
Marie Tahon
|
Agnes Delaborde
|
Laurence Devillers
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This article presents a corpus featuring children playing games in interaction with the humanoid robot Nao: children have to express emotions in the course of a story told by the robot. This corpus was collected to design an affective interactive system driven by an interactional and emotional representation of the user. We evaluate here some mid-level markers used in our system: reaction time, speech duration, and intensity level. We also examine the presence of affect bursts, which are quite numerous in our corpus, probably because of the young age of the children and the absence of predefined lexical content.
2010
pdf
abs
Building a System for Emotions Detection from Speech to Control an Affective Avatar
Mátyás Brendel
|
Riccardo Zaccarelli
|
Laurence Devillers
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In this paper we describe a corpus assembled from two sub-corpora. The CINEMO corpus contains acted emotional expressions obtained through dubbing exercises. This new protocol is a way to collect large amounts of mood-induced data exhibiting several complex and shaded emotions. JEMO is a corpus collected with an emotion-detection game and contains more prototypical emotions than CINEMO. We show how the two sub-corpora balance and enrich each other and result in better performance. We build male and female emotion models and use Sequential Fast Forward Feature Selection to improve detection performance. After feature selection we obtain good results even with our strict speaker-independent testing method. The global corpus contains 88 speakers (38 females, 50 males). This study has been carried out within the scope of the ANR (National Research Agency) Affective Avatar project, which deals with building an emotion detection system for monitoring an artificial agent by voice.
pdf
abs
CINEMO — A French Spoken Language Resource for Complex Emotions: Facts and Baselines
Björn Schuller
|
Riccardo Zaccarelli
|
Nicolas Rollet
|
Laurence Devillers
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
The CINEMO corpus of French emotional speech provides a richly annotated resource to help overcome the apparent lack of learning and testing speech material for complex, i.e. blended or mixed, emotions. The protocol for its collection was the dubbing of selected emotional scenes from French movies. The corpus contains 51 speakers, and the total speech time amounts to 2 hours and 13 minutes, yielding 4k speech chunks after segmentation. Extensive labelling was carried out in 16 categories for major and minor emotions and in 6 continuous dimensions. In this contribution we give insight into the corpus statistics, focusing in particular on complex emotions, and provide benchmark recognition results obtained in exemplary large feature space evaluations. The labelling of the collected speech clearly demonstrates that a complex handling of emotion is needed. Further, the automatic recognition experiments provide evidence that the automatic recognition of blended emotions is feasible.
2008
pdf
abs
Coding Emotional Events in Audiovisual Corpora
Laurence Devillers
|
Jean-Claude Martin
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
The modelling of realistic emotional behaviour is needed for various applications in multimodal human-machine interaction such as the design of emotional conversational agents (Martin et al., 2005) or of emotional detection systems (Devillers and Vidrascu, 2007). Yet, building such models requires an appropriate definition of the various levels for representing the emotions themselves, but also of some contextual information such as the events that elicit these emotions. This paper presents a coding scheme that has been defined following annotations of a corpus of TV interviews (EmoTV). Deciding which events triggered or may trigger which emotion is a challenge for building efficient emotion-eliciting protocols. In this paper, we present the protocol that we defined for collecting another corpus of spontaneous human-human interactions recorded in laboratory conditions (EmoTaboo). We discuss the events that we designed for eliciting emotions. Part of this scheme for coding emotional events is being included in the specifications currently defined by a working group of the W3C (the W3C Emotion Incubator Working Group). This group is investigating the feasibility of working towards a standard representation of emotions and related states in technological contexts.
2007
pdf
SIMDIAL - Un paradigme pour évaluer automatiquement des systèmes de dialogue homme-machine en simulant un utilisateur de façon déterministe [SIMDIAL - A paradigm for the automatic evaluation of human-machine dialogue systems by deterministic simulation of a user]
Joseph Allemandou
|
Laurent Charnay
|
Laurence Devillers
|
Muriel Lauvergne
|
Joseph Mariani
Traitement Automatique des Langues, Volume 48, Numéro 1 : Principes de l'évaluation en Traitement Automatique des Langues [Principles of Evaluation in Natural Language Processing]
2006
pdf
abs
Fear-type emotions of the SAFE Corpus: annotation issues
Chloé Clavel
|
Ioana Vasilescu
|
Laurence Devillers
|
Thibaut Ehrette
|
Gaël Richard
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
The present research focuses on annotation issues in the context of the acoustic detection of fear-type emotions for surveillance applications. The emotional speech material used for this study comes from the previously collected SAFE Database (Situation Analysis in a Fictional and Emotional Database), which consists of audio-visual sequences extracted from fiction movies. A generic annotation scheme was developed to annotate the various emotional manifestations contained in the corpus. The annotation was carried out by two labellers, and the two annotation strategies are compared. It emerges that the borderline between emotion and neutral varies according to the labeller. An acoustic validation by a third labeller allows the two strategies to be analysed. Two human strategies are observed: a first one, context-oriented, which mixes audio and contextual (video) information in emotion categorization; and a second one based mainly on audio information. K-means clustering confirms the role of audio cues in human annotation strategies and particularly helps in evaluating those strategies from the point of view of a detection system based on audio cues.
pdf
abs
Real life emotions in French and English TV video clips: an integrated annotation protocol combining continuous and discrete approaches
L. Devillers
|
R. Cowie
|
J-C. Martin
|
E. Douglas-Cowie
|
S. Abrilian
|
M. McRorie
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
A major barrier to the development of accurate and realistic models of human emotions is the absence of multi-cultural / multilingual databases of real-life behaviours and of a federative and reliable annotation protocol. The QUB and LIMSI teams are working towards the definition of an integrated coding scheme combining their complementary approaches. This multilevel integrated scheme combines the dimensions that appear to be useful for the study of real-life emotions: verbal labels, abstract dimensions, and contextual (appraisal-based) annotations. This paper describes this integrated coding scheme, the protocol set up for annotating French and English video clips of emotional interviews, and the results (e.g. inter-coder agreement measures and a subjective evaluation of the scheme).
pdf
abs
Manual Annotation and Automatic Image Processing of Multimodal Emotional Behaviours: Validating the Annotation of TV Interviews
J.-C. Martin
|
G. Caridakis
|
L. Devillers
|
K. Karpouzis
|
S. Abrilian
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
There has been a great deal of psychological research on emotion and nonverbal communication. Yet, these studies were based mostly on acted basic emotions. This paper explores how manual annotation and image processing can cooperate towards the representation of spontaneous emotional behaviour in low-resolution videos from TV. We describe a corpus of TV interviews and the manual annotations that have been defined. We explain the image processing algorithms that have been designed for the automatic estimation of movement quantity. Finally, we explore how image processing can be used for the validation of manual annotations.
pdf
abs
Annotation of Emotions in Real-Life Video Interviews: Variability between Coders
S. Abrilian
|
L. Devillers
|
J-C. Martin
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Research on emotional real-life data has to tackle the problem of its annotation. The annotation of emotional corpora raises the issue of how different coders perceive the same multimodal emotional behaviour. The long-term goal of this paper is to produce a guideline for the selection of annotators. The LIMSI team is working towards the definition of a coding scheme integrating emotion, context, and multimodal annotations. We present the currently defined coding scheme for emotion annotation and the use of soft vectors for representing a mixture of emotions. This paper describes a perceptive test of emotion annotations and the results obtained with 40 different coders on a subset of complex real-life emotional segments selected from the EmoTV corpus collected at LIMSI. The results of this first study validate previous annotations of emotion mixtures and highlight differences in annotation between male and female coders.
2004
pdf
abs
The French MEDIA/EVALDA Project: the Evaluation of the Understanding Capability of Spoken Language Dialogue Systems
Laurence Devillers
|
Hélène Maynard
|
Sophie Rosset
|
Patrick Paroubek
|
Kevin McTait
|
D. Mostefa
|
Khalid Choukri
|
Laurent Charnay
|
Caroline Bousquet
|
Nadine Vigouroux
|
Frédéric Béchet
|
Laurent Romary
|
Jean-Yves Antoine
|
J. Villaneau
|
Myriam Vergnes
|
J. Goulian
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
The aim of the MEDIA project is to design and test a methodology for the evaluation of context-dependent and independent spoken dialogue systems. We propose an evaluation paradigm based on the use of test suites from real-world corpora and a common semantic representation and common metrics. This paradigm should allow us to diagnose the context-sensitive understanding capability of dialogue systems. This paradigm will be used within an evaluation campaign involving several sites, all of which will carry out the task of querying information from a database.
pdf
Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora
L. Devillers
|
I. Vasilescu
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2003
pdf
bib
The PEACE SLDS understanding evaluation paradigm of the French MEDIA campaign
Laurence Devillers
|
Hélène Maynard
|
Patrick Paroubek
|
Sophie Rosset
Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: are evaluation methods, metrics and resources reusable?
2002
pdf
Annotations for Dynamic Diagnosis of the Dialog State
Laurence Devillers
|
Sophie Rosset
|
Hélène Bonneau-Maynard
|
Lori Lamel
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
2000
pdf
Predictive Performance of Dialog Systems
H. Bonneau-Maynard
|
L. Devillers
|
S. Rosset
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)