Does a Virtual Talking Face Generate Proper Multimodal Cues to Draw User’s Attention to Points of Interest?

Stephan Raidt, Gérard Bailly, Frederic Elisei


Abstract
We present a series of experiments investigating face-to-face interaction between an Embodied Conversational Agent (ECA) and a human interlocutor. The ECA is embodied by a video-realistic talking head with independent head and eye movements. To be of benefit in face-to-face interaction, the ECA should be able to derive meaning from the communicative gestures of a human interlocutor and, likewise, to reproduce such gestures. By conveying its capability to interpret human behaviour, the system encourages the interlocutor to show appropriate natural activity. It is therefore important that the ECA knows how to display the equivalents of human mental states. This allows the machine processes of the system to be interpreted in terms of human expressiveness and assigned a corresponding meaning, so that the system can maintain an interaction based on human patterns. In a first experiment, we investigated the ability of our talking head to direct user attention with facial deictic cues (Raidt, Bailly et al. 2005). Users interacted with the ECA during a simple card game offering different levels of help and guidance through facial deictic cues. We analyzed the users’ performance and their perception of the quality of the assistance given by the ECA. The experiment showed that users benefit from its presence and from its facial deictic cues. In the follow-up series of experiments presented here, we investigated the effect of enhancing the multimodality of the deictic gestures by adding a spoken instruction.
Anthology ID:
L06-1021
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/49_pdf.pdf
Cite (ACL):
Stephan Raidt, Gérard Bailly, and Frederic Elisei. 2006. Does a Virtual Talking Face Generate Proper Multimodal Cues to Draw User’s Attention to Points of Interest?. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Does a Virtual Talking Face Generate Proper Multimodal Cues to Draw User’s Attention to Points of Interest? (Raidt et al., LREC 2006)
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/49_pdf.pdf