VOILA: An Optimised Dialogue System for Interactively Learning Visually-Grounded Word Meanings (Demonstration System)

Yanchao Yu, Arash Eshghi, Oliver Lemon


Abstract
We present VOILA: an optimised, multi-modal dialogue agent for interactive learning of visually grounded word meanings from a human user. VOILA is: (1) able to learn new visual categories from scratch through interaction with users; (2) trained on real human-human dialogues in the same domain, and so is able to conduct natural spontaneous dialogue; (3) optimised to find the most effective trade-off between the accuracy of the visual categories it learns and the cost it incurs to users. VOILA is deployed on Furhat, a human-like, multi-modal robot head with back-projection of the face, and a graphical virtual character.
Anthology ID:
W17-5524
Volume:
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Month:
August
Year:
2017
Address:
Saarbrücken, Germany
Editors:
Kristiina Jokinen, Manfred Stede, David DeVault, Annie Louis
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
197–200
URL:
https://aclanthology.org/W17-5524
DOI:
10.18653/v1/W17-5524
Cite (ACL):
Yanchao Yu, Arash Eshghi, and Oliver Lemon. 2017. VOILA: An Optimised Dialogue System for Interactively Learning Visually-Grounded Word Meanings (Demonstration System). In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, pages 197–200, Saarbrücken, Germany. Association for Computational Linguistics.
Cite (Informal):
VOILA: An Optimised Dialogue System for Interactively Learning Visually-Grounded Word Meanings (Demonstration System) (Yu et al., SIGDIAL 2017)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/W17-5524.pdf