2024
SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
Stephanie M. Lukin
|
Claire Bonial
|
Matthew Marge
|
Taylor A. Hudson
|
Cory J. Hayes
|
Kimberly Pollard
|
Anthony Baker
|
Ashley N. Foots
|
Ron Artstein
|
Felix Gervits
|
Mitchell Abrams
|
Cassidy Henry
|
Lucia Donatelli
|
Anton Leuski
|
Susan G. Hill
|
David Traum
|
Clare Voss
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker’s intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research on open questions about how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.
2022
Interactive Evaluation of Dialog Track at DSTC9
Shikib Mehri
|
Yulan Feng
|
Carla Gordon
|
Seyed Hossein Alavi
|
David Traum
|
Maxine Eskenazi
Proceedings of the Thirteenth Language Resources and Evaluation Conference
The ultimate goal of dialog research is to develop systems that can be effectively used in interactive settings by real users. To this end, we introduced the Interactive Evaluation of Dialog Track at the 9th Dialog System Technology Challenge. This track consisted of two sub-tasks. The first sub-task involved building knowledge-grounded response generation models. The second sub-task aimed to extend dialog models beyond static datasets by assessing them in an interactive setting with real users. Our track challenges participants to develop strong response generation models and explore strategies that extend them to back-and-forth interactions with real users. The progression from static corpora to interactive evaluation introduces unique challenges and facilitates a more thorough assessment of open-domain dialog systems. This paper provides an overview of the track, including the methodology and results. Furthermore, it provides insights into how to best evaluate open-domain dialog models.
Comparing Approaches to Language Understanding for Human-Robot Dialogue: An Error Taxonomy and Analysis
Ada Tur
|
David Traum
Proceedings of the Thirteenth Language Resources and Evaluation Conference
In this paper, we compare two different approaches to language understanding for a human-robot interaction domain in which a human commander gives navigation instructions to a robot. We contrast a relevance-based classifier with a GPT-2 model, using about 2000 input-output examples as training data. With this level of training data, the relevance-based model outperforms the GPT-2 based model 79% to 8%. We also present a taxonomy of types of errors made by each model, indicating that they have somewhat different strengths and weaknesses, so we also examine the potential for a combined model.
Evaluation of Off-the-shelf Speech Recognizers on Different Accents in a Dialogue Domain
Divya Tadimeti
|
Kallirroi Georgila
|
David Traum
Proceedings of the Thirteenth Language Resources and Evaluation Conference
We evaluate several publicly available off-the-shelf (commercial and research) automatic speech recognition (ASR) systems on dialogue agent-directed English speech from speakers with General American vs. non-American accents. Our results show that the performance of the ASR systems for non-American accents is considerably worse than for General American accents. Depending on the recognizer, the absolute difference in performance between General American accents and all non-American accents combined can vary approximately from 2% to 12%, with relative differences varying approximately between 16% and 49%. This drop in performance becomes even larger when we consider specific categories of non-American accents, indicating a need for more diligent collection of and training on non-native English speaker data in order to narrow this performance gap. There are performance differences across ASR systems, and while the same general pattern holds, with more errors for non-American accents, there are some accents for which the best recognizer differs from the one that is best overall. We expect these results to be useful for dialogue system designers in developing more robust, inclusive dialogue systems, and for ASR providers in taking into account performance requirements for different accents.
2021
Builder, we have done it: Evaluating & Extending Dialogue-AMR NLU Pipeline for Two Collaborative Domains
Claire Bonial
|
Mitchell Abrams
|
David Traum
|
Clare Voss
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
We adopt, evaluate, and improve upon a two-step natural language understanding (NLU) pipeline that incrementally tames the variation of unconstrained natural language input and maps to executable robot behaviors. The pipeline first leverages Abstract Meaning Representation (AMR) parsing to capture the propositional content of the utterance, and second converts this into “Dialogue-AMR,” which augments standard AMR with information on tense, aspect, and speech acts. Several alternative approaches and training datasets are evaluated for both steps and corresponding components of the pipeline, some of which outperform the original. We extend the Dialogue-AMR annotation schema to cover a different collaborative instruction domain and evaluate on both domains. With very little training data, we achieve promising performance in the new domain, demonstrating the scalability of this approach.
2020
Dialogue-AMR: Abstract Meaning Representation for Dialogue
Claire Bonial
|
Lucia Donatelli
|
Mitchell Abrams
|
Stephanie M. Lukin
|
Stephen Tratz
|
Matthew Marge
|
Ron Artstein
|
David Traum
|
Clare Voss
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.
Predicting Ratings of Real Dialogue Participants from Artificial Data and Ratings of Human Dialogue Observers
Kallirroi Georgila
|
Carla Gordon
|
Volodymyr Yanov
|
David Traum
Proceedings of the Twelfth Language Resources and Evaluation Conference
We collected a corpus of dialogues in a Wizard of Oz (WOz) setting in the Internet of Things (IoT) domain. We asked users participating in these dialogues to rate the system on a number of aspects, namely, intelligence, naturalness, personality, friendliness, their enjoyment, overall quality, and whether they would recommend the system to others. Then we asked dialogue observers, i.e., Amazon Mechanical Turkers (MTurkers), to rate these dialogues on the same aspects. We also generated simulated dialogues between dialogue policies and simulated users and asked MTurkers to rate them again on the same aspects. Using linear regression, we developed dialogue evaluation functions based on features from the simulated dialogues and the MTurkers’ ratings, the WOz dialogues and the MTurkers’ ratings, and the WOz dialogues and the WOz participants’ ratings. We applied all these dialogue evaluation functions to a held-out portion of our WOz dialogues, and we report results on the predictive power of these different types of dialogue evaluation functions. Our results suggest that for three conversational aspects (intelligence, naturalness, overall quality) just training evaluation functions on simulated data could be sufficient.
Which Model Should We Use for a Real-World Conversational Dialogue System? a Cross-Language Relevance Model or a Deep Neural Net?
Seyed Hossein Alavi
|
Anton Leuski
|
David Traum
Proceedings of the Twelfth Language Resources and Evaluation Conference
We compare two models for corpus-based selection of dialogue responses: a cross-language relevance model and a cross-language LSTM model. Each model is tested on multiple corpora, collected from two different types of dialogue source material. Results show that while the LSTM model performs adequately on a very large corpus (millions of utterances), its performance is dominated by the cross-language relevance model for a more moderate-sized corpus (tens of thousands of utterances).
Exploring a Choctaw Language Corpus with Word Vectors and Minimum Distance Length
Jacqueline Brixey
|
David Sides
|
Timothy Vizthum
|
David Traum
|
Khalil Iskarous
Proceedings of the Twelfth Language Resources and Evaluation Conference
This work introduces additions to the corpus ChoCo, a multimodal corpus for the American indigenous language Choctaw. Using texts from the corpus, we develop new computational resources by using two off-the-shelf tools: word2vec and Linguistica. Our work illustrates how these tools can be successfully implemented with a small corpus.
Evaluation of Off-the-shelf Speech Recognizers Across Diverse Dialogue Domains
Kallirroi Georgila
|
Anton Leuski
|
Volodymyr Yanov
|
David Traum
Proceedings of the Twelfth Language Resources and Evaluation Conference
We evaluate several publicly available off-the-shelf (commercial and research) automatic speech recognition (ASR) systems across diverse dialogue domains (in US-English). Our evaluation is aimed at non-experts with limited experience in speech recognition. Our goal is not only to compare a variety of ASR systems on several diverse data sets but also to measure how much ASR technology has advanced since our previous large-scale evaluations on the same data sets. Our results show that the performance of each speech recognizer can vary significantly depending on the domain. Furthermore, despite major recent progress in ASR technology, current state-of-the-art speech recognizers perform poorly in domains that require special vocabulary and language models, and under noisy conditions. We expect that our evaluation will prove useful to ASR consumers and dialogue system designers.
2019
A Blissymbolics Translation System
Usman Sohail
|
David Traum
Proceedings of the Eighth Workshop on Speech and Language Processing for Assistive Technologies
Blissymbolics (Bliss) is a pictographic writing system that is used by people with communication disorders. Bliss attempts to create a writing system that makes words easier to distinguish by using pictographic symbols that encapsulate meaning rather than sound (as the English alphabet does, for example). Users of Bliss rely on human interpreters to use Bliss. We created a translation system from Bliss to natural English in the hope of decreasing the reliance on human interpreters by the Bliss community. We first discuss the basic rules of Blissymbolics. Then we point out some of the challenges associated with developing computer-assisted tools for Blissymbolics. Next we talk about our ongoing work in developing a translation system, including current limitations and future work. We conclude with a set of examples showing the current capabilities of our translation system.
Augmenting Abstract Meaning Representation for Human-Robot Dialogue
Claire Bonial
|
Lucia Donatelli
|
Stephanie M. Lukin
|
Stephen Tratz
|
Ron Artstein
|
David Traum
|
Clare Voss
Proceedings of the First International Workshop on Designing Meaning Representations
We detail refinements made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. We propose 36 augmented AMRs that capture speech acts, tense and aspect, and spatial information. This linguistic information is vital for representing important distinctions, for example whether the robot has moved, is moving, or will move. We evaluate two existing AMR parsers for their performance on dialogue data. We also outline a model for graph-to-graph conversion, in which output from AMR parsers is converted into our refined AMRs. The design scheme presented here, though task-specific, is extendable for broad coverage of speech acts using AMR in future task-independent work.
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Anna Korhonen
|
David Traum
|
Lluís Màrquez
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
2018
Consequences and Factors of Stylistic Differences in Human-Robot Dialogue
Stephanie Lukin
|
Kimberly Pollard
|
Claire Bonial
|
Matthew Marge
|
Cassidy Henry
|
Ron Artstein
|
David Traum
|
Clare Voss
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
This paper identifies stylistic differences in instruction-giving observed in a corpus of human-robot dialogue. Differences in verbosity and structure (i.e., single-intent vs. multi-intent instructions) arose naturally without restrictions or prior guidance on how users should speak with the robot. Different styles were found to produce different rates of miscommunication, and correlations were found between style differences and individual user variation, trust, and interaction experience with the robot. Understanding potential consequences and factors that influence style can inform design of dialogue systems that are robust to natural variation from human users.
Dialogue Structure Annotation for Multi-Floor Interaction
David Traum
|
Cassidy Henry
|
Stephanie Lukin
|
Ron Artstein
|
Felix Gervits
|
Kimberly Pollard
|
Claire Bonial
|
Su Lei
|
Clare Voss
|
Matthew Marge
|
Cory Hayes
|
Susan Hill
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
The Niki and Julie Corpus: Collaborative Multimodal Dialogues between Humans, Robots, and Virtual Agents
Ron Artstein
|
Jill Boberg
|
Alesia Gainer
|
Jonathan Gratch
|
Emmanuel Johnson
|
Anton Leuski
|
Gale Lucas
|
David Traum
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Identification of Personal Information Shared in Chat-Oriented Dialogue
Sarah Fillwock
|
David Traum
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
ScoutBot: A Dialogue System for Collaborative Navigation
Stephanie M. Lukin
|
Felix Gervits
|
Cory J. Hayes
|
Pooja Moolchandani
|
Anton Leuski
|
John G. Rogers III
|
Carlos Sanchez Amaro
|
Matthew Marge
|
Clare R. Voss
|
David Traum
Proceedings of ACL 2018, System Demonstrations
ScoutBot is a dialogue interface to physical and simulated robots that supports collaborative exploration of environments. The demonstration will allow users to issue unconstrained spoken language commands to ScoutBot. ScoutBot will prompt for clarification if the user’s instruction needs additional input. It is trained on human-robot dialogue collected from Wizard-of-Oz experiments, where robot responses were initiated by a human wizard in previous interactions. The demonstration will show a simulated ground robot (Clearpath Jackal) in a simulated environment supported by ROS (Robot Operating System).
2017
Exploring Variation of Natural Human Commands to a Robot in a Collaborative Navigation Task
Matthew Marge
|
Claire Bonial
|
Ashley Foots
|
Cory Hayes
|
Cassidy Henry
|
Kimberly Pollard
|
Ron Artstein
|
Clare Voss
|
David Traum
Proceedings of the First Workshop on Language Grounding for Robotics
Robot-directed communication is variable, and may change based on human perception of robot capabilities. To collect training data for a dialogue system and to investigate possible communication changes over time, we developed a Wizard-of-Oz study that (a) simulates a robot’s limited understanding, and (b) collects dialogues where human participants build a progressively better mental model of the robot’s understanding. With ten participants, we collected ten hours of human-robot dialogue. We analyzed the structure of instructions that participants gave to a remote robot before it responded. Our findings show a general initial preference for including metric information (e.g., move forward 3 feet) over landmarks (e.g., move to the desk) in motion commands, but this decreased over time, suggesting changes in perception.
DialPort, Gone Live: An Update After A Year of Development
Kyusong Lee
|
Tiancheng Zhao
|
Yulun Du
|
Edward Cai
|
Allen Lu
|
Eli Pincus
|
David Traum
|
Stefan Ultes
|
Lina M. Rojas-Barahona
|
Milica Gasic
|
Steve Young
|
Maxine Eskenazi
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
DialPort collects user data for connected spoken dialog systems. At present six systems are linked to a central portal that directs the user to the applicable system and suggests systems that the user may be interested in. User data has started to flow into the system.
2016
Analyzing the Effect of Entrainment on Dialogue Acts
Masahiro Mizukami
|
Koichiro Yoshino
|
Graham Neubig
|
David Traum
|
Satoshi Nakamura
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Towards a Multi-dimensional Taxonomy of Stories in Dialogue
Kathryn J. Collins
|
David Traum
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
In this paper, we present a taxonomy of stories told in dialogue. We based our scheme on prior work analyzing narrative structure and method of telling, relation to storyteller identity, as well as some categories particular to dialogue, such as how the story gets introduced. Our taxonomy currently has 5 major dimensions, with most having sub-dimensions; each dimension has an associated set of dimension-specific labels. We adapted an annotation tool for this taxonomy and have annotated portions of two different dialogue corpora, Switchboard and the Distress Analysis Interview Corpus. We present examples of some of the tags and concepts with stories from Switchboard, and some initial statistics of frequencies of the tags.
Towards Automatic Identification of Effective Clues for Team Word-Guessing Games
Eli Pincus
|
David Traum
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Team word-guessing games, where one player, the clue-giver, gives clues attempting to elicit a target-word from another player, the receiver, are a popular form of entertainment and are also used for educational purposes. Creating an engaging computational agent capable of emulating a talented human clue-giver in a timed word-guessing game depends on the ability to provide effective clues (clues able to elicit a correct guess from a human receiver). There are many available web resources and databases that can be mined for the raw material for clues for target-words; however, a large number of those clues are unlikely to be able to elicit a correct guess from a human guesser. In this paper, we propose a method for automatically filtering a clue corpus for effective clues for an arbitrary target-word from a larger set of potential clues, using machine learning on a set of features of the clues, including point-wise mutual information between a clue’s constituent words and a clue’s target-word. The results of the experiments significantly improve the average clue quality over previous approaches, and bring quality rates in line with measures of human clue quality derived from a corpus of human-human interactions. The paper also introduces the data used to develop this method: audio recordings of people making guesses after having heard the clues being spoken by a synthesized voice.
New Dimensions in Testimony Demonstration
Ron Artstein
|
Alesia Gainer
|
Kallirroi Georgila
|
Anton Leuski
|
Ari Shapiro
|
David Traum
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
2015
Reinforcement Learning in Multi-Party Trading Dialog
Takuya Hiraoka
|
Kallirroi Georgila
|
Elnaz Nouri
|
David Traum
|
Satoshi Nakamura
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Which Synthetic Voice Should I Choose for an Evocative Task?
Eli Pincus
|
Kallirroi Georgila
|
David Traum
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Evaluating Spoken Dialogue Processing for Time-Offset Interaction
David Traum
|
Kallirroi Georgila
|
Ron Artstein
|
Anton Leuski
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
The Real Challenge 2014: Progress and Prospects
Maxine Eskenazi
|
Alan W Black
|
Sungjin Lee
|
David Traum
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
2014
Initiative Taking in Negotiation
Elnaz Nouri
|
David Traum
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)
SAWDUST: a Semi-Automated Wizard Dialogue Utterance Selection Tool for domain-independent large-domain dialogue
Sudeep Gandhe
|
David Traum
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)
A Demonstration of Dialogue Processing in SimSensei Kiosk
Fabrizio Morbini
|
David DeVault
|
Kallirroi Georgila
|
Ron Artstein
|
David Traum
|
Louis-Philippe Morency
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)
The Distress Analysis Interview Corpus of human and computer interviews
Jonathan Gratch
|
Ron Artstein
|
Gale Lucas
|
Giota Stratou
|
Stefan Scherer
|
Angela Nazarian
|
Rachel Wood
|
Jill Boberg
|
David DeVault
|
Stacy Marsella
|
David Traum
|
Skip Rizzo
|
Louis-Philippe Morency
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The Distress Analysis Interview Corpus (DAIC) contains clinical interviews designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post-traumatic stress disorder. The interviews are conducted by humans, human-controlled agents, and autonomous agents, and the participants include both distressed and non-distressed individuals. Data collected include audio and video recordings and extensive questionnaire responses; parts of the corpus have been transcribed and annotated for a variety of verbal and non-verbal features. The corpus has been used to support the creation of an automated interviewer agent, and for research on the automatic identification of psychological distress.
Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies
Kallirroi Georgila
|
Claire Nelson
|
David Traum
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2013
Verbal indicators of psychological distress in interactive dialogue with a virtual human
David DeVault
|
Kallirroi Georgila
|
Ron Artstein
|
Fabrizio Morbini
|
David Traum
|
Stefan Scherer
|
Albert Skip Rizzo
|
Louis-Philippe Morency
Proceedings of the SIGDIAL 2013 Conference
Surface Text based Dialogue Models for Virtual Humans
Sudeep Gandhe
|
David Traum
Proceedings of the SIGDIAL 2013 Conference
Roundtable: An Online Framework for Building Web-based Conversational Agents
Eric Forbell
|
Nicolai Kalisch
|
Fabrizio Morbini
|
Kelly Christoffersen
|
Kenji Sagae
|
David Traum
|
Albert A. Rizzo
Proceedings of the SIGDIAL 2013 Conference
Which ASR should I choose for my dialogue system?
Fabrizio Morbini
|
Kartik Audhkhasi
|
Kenji Sagae
|
Ron Artstein
|
Doğan Can
|
Panayiotis Georgiou
|
Shri Narayanan
|
Anton Leuski
|
David Traum
Proceedings of the SIGDIAL 2013 Conference
A method for the approximation of incremental understanding of explicit utterance meaning using predictive models in finite domains
David DeVault
|
David Traum
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2012
ISO 24617-2: A semantically-based standard for dialogue annotation
Harry Bunt
|
Jan Alexandersson
|
Jae-Woong Choe
|
Alex Chengyu Fang
|
Koiti Hasida
|
Volha Petukhova
|
Andrei Popescu-Belis
|
David Traum
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper summarizes the latest, final version of ISO standard 24617-2 "Semantic annotation framework, Part 2: Dialogue acts". Compared to the preliminary version ISO DIS 24617-2:2010, described in Bunt et al. (2010), the final version additionally includes concepts for annotating rhetorical relations between dialogue units, defines a full-blown compositional semantics for the Dialogue Act Markup Language DiAML (resulting, as a side-effect, in a different treatment of functional dependence relations among dialogue acts and feedback dependence relations); and specifies an optimally transparent XML-based reference format for the representation of DiAML annotations, based on the systematic application of the notion of 'ideal concrete syntax'. We describe these differences and briefly discuss the design and implementation of an incremental method for dialogue act recognition, which proves the usability of the ISO standard for automatic dialogue annotation.
Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems
Kallirroi Georgila
|
Alan Black
|
Kenji Sagae
|
David Traum
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The current practice in virtual human dialogue systems is to use professional human recordings or limited-domain speech synthesis. Both approaches lead to good performance but at a high cost. To determine the best trade-off between performance and cost, we perform a systematic evaluation of human and synthesized voices with regard to naturalness, conversational aspect, and likability. We vary the type (in-domain vs. out-of-domain), length, and content of utterances, and take into account the age and native language of raters as well as their familiarity with speech synthesis. We present detailed results from two studies, a pilot one and one run on Amazon's Mechanical Turk. Our results suggest that a professional human voice can supersede both an amateur human voice and synthesized voices. Also, a high-quality general-purpose voice or a good limited-domain voice can perform better than amateur human recordings. We do not find any significant differences between the performance of a high-quality general-purpose voice and a limited-domain voice, both trained with speech recorded by actors. As expected, the high-quality general-purpose voice is rated higher than the limited-domain voice for out-of-domain sentences and lower for in-domain sentences. There is also a trend for long or negative-content utterances to receive lower ratings.
The Twins Corpus of Museum Visitor Questions
Priti Aggarwal
|
Ron Artstein
|
Jillian Gerten
|
Athanasios Katsamanis
|
Shrikanth Narayanan
|
Angela Nazarian
|
David Traum
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The Twins corpus is a collection of utterances spoken in interactions with two virtual characters who serve as guides at the Museum of Science in Boston. The corpus contains about 200,000 spoken utterances from museum visitors (primarily children) as well as from trained handlers who work at the museum. In addition to speech recordings, the corpus contains the outputs of speech recognition performed at the time of utterance as well as the system interpretation of the utterances. Parts of the corpus have been manually transcribed and annotated for question interpretation. The corpus has been used for improving performance of the museum characters and for a variety of research projects, such as phonetic-based Natural Language Understanding, creation of conversational characters from text resources, dialogue policy learning, and research on patterns of user interaction. It has the potential to be used for research on children's speech and on language used when talking to a virtual human.
Reinforcement Learning of Question-Answering Dialogue Policies for Virtual Museum Guides
Teruhisa Misu
|
Kallirroi Georgila
|
Anton Leuski
|
David Traum
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
A Demonstration of Incremental Speech Understanding and Confidence Estimation in a Virtual Human Dialogue System
David DeVault
|
David Traum
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
A Mixed-Initiative Conversational Dialogue System for Healthcare
Fabrizio Morbini
|
Eric Forbell
|
David DeVault
|
Kenji Sagae
|
David Traum
|
Albert Rizzo
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)
Maxine Eskenazi
|
Alan Black
|
David Traum
NAACL-HLT Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data (SDCTD 2012)
Incremental Speech Understanding in a Multi-Party Virtual Human Dialogue System
David DeVault
|
David Traum
Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2011
Proceedings of the SIGDIAL 2011 Conference
Joyce Y. Chai
|
Johanna D. Moore
|
Rebecca J. Passonneau
|
David R. Traum
Proceedings of the SIGDIAL 2011 Conference
An Annotation Scheme for Cross-Cultural Argumentation and Persuasion Dialogues
Kallirroi Georgila
|
Ron Artstein
|
Angela Nazarian
|
Michael Rushforth
|
David Traum
|
Katia Sycara
Proceedings of the SIGDIAL 2011 Conference
Rapid Development of Advanced Question-Answering Characters by Non-experts
Sudeep Gandhe
|
Alysa Taylor
|
Jillian Gerten
|
David Traum
Proceedings of the SIGDIAL 2011 Conference
2010
pdf
Don’t tell anyone! Two Experiments on Gossip Conversations
Jenny Brusk
|
Ron Artstein
|
David Traum
Proceedings of the SIGDIAL 2010 Conference
pdf
I’ve said it before, and I’ll say it again: An empirical investigation of the upper bound of the selection approach to dialogue
Sudeep Gandhe
|
David Traum
Proceedings of the SIGDIAL 2010 Conference
pdf
abs
Towards an ISO Standard for Dialogue Act Annotation
Harry Bunt
|
Jan Alexandersson
|
Jean Carletta
|
Jae-Woong Choe
|
Alex Chengyu Fang
|
Koiti Hasida
|
Kiyong Lee
|
Volha Petukhova
|
Andrei Popescu-Belis
|
Laurent Romary
|
Claudia Soria
|
David Traum
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper describes an ISO project which aims at developing a standard for annotating spoken and multimodal dialogue with semantic information concerning the communicative functions of utterances, the kind of semantic content they address, and their relations with what was said and done earlier in the dialogue. The project, ISO 24617-2 "Semantic annotation framework, Part 2: Dialogue acts", is currently at DIS stage. The proposed annotation schema distinguishes 9 orthogonal dimensions, allowing each functional segment in dialogue to have a function in each of these dimensions, thus accounting for the multifunctionality that utterances in dialogue often have. A number of core communicative functions is defined in the form of ISO data categories, available at http://semantic-annotation.uvt.nl/dialogue-acts/iso-datcats.pdf; they are divided into "dimension-specific" functions, which can be used only in a particular dimension, such as Turn Accept in the Turn Management dimension, and "general-purpose" functions, which can be used in any dimension, such as Inform and Request. An XML-based annotation language, "DiAML" is defined, with an abstract syntax, a semantics, and a concrete syntax.
pdf
abs
NPCEditor: A Tool for Building Question-Answering Characters
Anton Leuski
|
David Traum
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
NPCEditor is a system for building and deploying virtual characters capable of engaging a user in spoken dialog on a limited domain. The dialogue may take any form as long as the character responses can be specified a priori. For example, NPCEditor has been used for constructing question answering characters where a user asks questions and the character responds, but other scenarios are possible. At the core of the system is a state of the art statistical language classification technology for mapping from user's text input to system responses. NPCEditor combines the classifier with a database that stores the character information and relevant language data, a server that allows the character designer to deploy the completed characters, and a user-friendly editor that helps the designer to accomplish both character design and deployment tasks. In the paper we define the overall system architecture, describe individual NPCEditor components, and guide the reader through the steps of building a virtual character.
pdf
abs
Dialogues in Context: An Objective User-Oriented Evaluation Approach for Virtual Human Dialogue
Susan Robinson
|
Antonio Roque
|
David Traum
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
As conversational agents are now being developed to encounter more complex dialogue situations, it is increasingly difficult to find satisfactory methods for evaluating these agents. Task-based measures are insufficient where there is no clearly defined task. While user-based evaluation methods may give a general sense of the quality of an agent's performance, they shed little light on the relative quality or success of specific features of dialogue that are necessary for system improvement. This paper examines current dialogue agent evaluation practices and motivates the need for a more detailed approach for defining and measuring the quality of dialogues between agent and user. We present a framework for evaluating the dialogue competence of artificial agents involved in complex and underspecified tasks when conversing with people. A multi-part coding scheme is proposed that provides a qualitative analysis of human utterances, and rates the appropriateness of the agent's responses to these utterances. The scheme is outlined, and then used to evaluate Staff Duty Officer Moleno, a virtual guide in Second Life.
pdf
abs
Practical Evaluation of Speech Recognizers for Virtual Human Dialogue Systems
Xuchen Yao
|
Pravin Bhutada
|
Kallirroi Georgila
|
Kenji Sagae
|
Ron Artstein
|
David Traum
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
We perform a large-scale evaluation of multiple off-the-shelf speech recognizers across diverse domains for virtual human dialogue systems. Our evaluation is aimed at speech recognition consumers and potential consumers with limited experience with readily available recognizers. We focus on practical factors to determine what levels of performance can be expected from different available recognizers in various projects featuring different types of conversational utterances. Our results show that there is no single recognizer that outperforms all other recognizers in all domains. The performance of each recognizer may vary significantly depending on the domain, the size and perplexity of the corpus, the out-of-vocabulary rate, and whether acoustic and language model adaptation has been used or not. We expect that our evaluation will prove useful to other speech recognition consumers, especially in the dialogue community, and will shed some light on the key problem in spoken dialogue systems of selecting the most suitable available speech recognition system for a particular application, and what impact training will have.
pdf
Interpretation of Partial Utterances in Virtual Human Dialogue Systems
Kenji Sagae
|
David DeVault
|
David Traum
Proceedings of the NAACL HLT 2010 Demonstration Session
2009
pdf
Towards Natural Language Understanding of Partial Speech Recognition Results in Dialogue Systems
Kenji Sagae
|
Gwen Christian
|
David DeVault
|
David Traum
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
pdf
A computational account of comparative implicatures for a spoken dialogue agent
Luciana Benotti
|
David Traum
Proceedings of the Eight International Conference on Computational Semantics
pdf
bib
Can I Finish? Learning When to Respond to Incremental Interpretation Results in Interactive Dialogue
David DeVault
|
Kenji Sagae
|
David Traum
Proceedings of the SIGDIAL 2009 Conference
pdf
bib
HCSNet Plenary Talk: Spoken Dialogue Models for Virtual Humans
David Traum
Proceedings of the Australasian Language Technology Association Workshop 2009
2008
pdf
Degrees of Grounding Based on Evidence of Understanding
Antonio Roque
|
David Traum
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
pdf
Evaluation Understudy for Dialogue Coherence Models
Sudeep Gandhe
|
David Traum
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
pdf
Making Grammar-Based Generation Easier to Deploy in Dialogue Systems
David DeVault
|
David Traum
|
Ron Artstein
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
pdf
Practical Grammar-Based NLG from Examples
David DeVault
|
David Traum
|
Ron Artstein
Proceedings of the Fifth International Natural Language Generation Conference
pdf
abs
A Common Ground for Virtual Humans: Using an Ontology in a Natural Language Oriented Virtual Human Architecture
Arno Hartholt
|
Thomas Russ
|
David Traum
|
Eduard Hovy
|
Susan Robinson
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
When dealing with large, distributed systems that use state-of-the-art components, individual components are usually developed in parallel. As development continues, the decoupling invariably leads to a mismatch between how these components internally represent concepts and how they communicate these representations to other components: representations can get out of synch, contain localized errors, or become manageable only by a small group of experts for each module. In this paper, we describe the use of an ontology as part of a complex distributed virtual human architecture in order to enable better communication between modules while improving the overall flexibility needed to change or extend the system. We focus on the natural language understanding capabilities of this architecture and the relationship between language and concepts within the entire system in general and the ontology in particular.
pdf
abs
What would you Ask a conversational Agent? Observations of Human-Agent Dialogues in a Museum Setting
Susan Robinson
|
David Traum
|
Midhun Ittycheriah
|
Joe Henderer
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Embodied Conversational Agents have typically been constructed for use in limited domain applications, and tested in very specialized environments. Only in recent years have there been more cases of moving agents into wider public applications (e.g. Bell et al., 2003; Kopp et al., 2005). Yet little analysis has been done to determine the differing needs, expectations, and behavior of human users in these environments. With an increasing trend for virtual characters to go public, we need to expand our understanding of what this entails for the design and capabilities of our characters. This paper explores these issues through an analysis of a corpus that has been collected since December 2006, from interactions with the virtual character Sgt Blackwell at the Cooper Hewitt Museum in New York. The analysis includes 82 hierarchical categories of user utterances, as well as specific observations on user preferences and behaviors drawn from interactions with Blackwell.
2007
pdf
A Model of Compliance and Emotion for Potentially Adversarial Dialogue Agents
Antonio Roque
|
David Traum
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue
pdf
Hassan: A Virtual Human for Tactical Questioning
David Traum
|
Antonio Roque
|
Anton Leuski
|
Panayiotis Georgiou
|
Jillian Gerten
|
Bilyana Martinovski
|
Shrikanth Narayanan
|
Susan Robinson
|
Ashish Vaswani
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue
pdf
Dynamic Movement and Positioning of Embodied Agents in Multiparty Conversations
Dušan Jan
|
David Traum
Proceedings of the Workshop on Embodied Language Processing
2006
pdf
Building Effective Question Answering Characters
Anton Leuski
|
Ronakkumar Patel
|
David Traum
|
Brandon Kennedy
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue
pdf
An Information State-Based Dialogue Manager for Call for Fire Dialogues
Antonio Roque
|
David Traum
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue
2005
pdf
Transonics: A Practical Speech-to-Speech Translator for English-Farsi Medical Dialogs
Robert Belvin
|
Emil Ettelaie
|
Sudeep Gandhe
|
Panayiotis Georgiou
|
Kevin Knight
|
Daniel Marcu
|
Scott Millward
|
Shrikanth Narayanan
|
Howard Neely
|
David Traum
Proceedings of the ACL Interactive Poster and Demonstration Sessions
pdf
Dealing with Doctors: A Virtual Human for Non-team Interaction
David Traum
|
William Swartout
|
Jonathan Gratch
|
Stacy Marsella
|
Patrick Kenny
|
Eduard Hovy
|
Shri Narayanan
|
Ed Fast
|
Bilyana Martinovski
|
Rahul Baghat
|
Susan Robinson
|
Andrew Marshall
|
Dagen Wang
|
Sudeep Gandhe
|
Anton Leuski
Proceedings of the 6th SIGdial Workshop on Discourse and Dialogue
2004
pdf
Evaluation of Transcription and Annotation Tools for a Multi-modal, Multi-party Dialogue Corpus
Saurabh Garg
|
Bilyana Martinovski
|
Susan Robinson
|
Jens Stephan
|
Joel Tetreault
|
David R. Traum
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
pdf
Issues in Corpus Development for Multi-party Multi-modal Task-oriented Dialogue
Susan Robinson
|
Bilyana Martinovski
|
Saurabh Garg
|
Jens Stephan
|
David Traum
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
pdf
Evaluation of Multi-party Virtual Reality Dialogue Interaction
David R. Traum
|
Susan Robinson
|
Jens Stephan
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2001
pdf
Implicit cues for explicit generation: using telicity as a cue for tense structure in a Chinese to English MT system
Mari Olsen
|
David Traum
|
Carol van Ess-Dykema
|
Amy Weinberg
Proceedings of Machine Translation Summit VIII
2000
pdf
Telicity as a Cue to Temporal and Discourse Structure in Chinese-English Machine Translation
Mari Olsen
|
David Traum
|
Carol Van Ess-Dykema
|
Amy Weinberg
|
Ron Dolan
NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP
pdf
Generation from Lexical Conceptual Structures
David Traum
|
Nizar Habash
NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP
pdf
bib
Modelling Grounding and Discourse Obligations Using Update Rules
Colin Matheson
|
Massimo Poesio
|
David Traum
1st Meeting of the North American Chapter of the Association for Computational Linguistics
1999
pdf
A Two-level Approach to Coding Dialogue for Discourse Structure: Activities of the 1998 DRI Working Group on Higher-level Structures
David R. Traum
|
Christine H. Nakatani
Towards Standards and Tools for Discourse Tagging
1998
pdf
abs
A thematic hierarchy for efficient generation from lexical-conceptual structure
Bonnie Dorr
|
Nizar Habash
|
David Traum
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
This paper describes an implemented algorithm for syntactic realization of a target-language sentence from an interlingual representation called Lexical Conceptual Structure (LCS). We provide a mapping between LCS thematic roles and Abstract Meaning Representation (AMR) relations; these relations serve as input to an off-the-shelf generator (Nitrogen). There are two contributions of this work: (1) the development of a thematic hierarchy that provides ordering information for realization of arguments in their surface positions; (2) the provision of a diagnostic tool for detecting inconsistencies in an existing online LCS-based lexicon that allows us to enhance principles for thematic-role assignment.
1996
pdf
Book Reviews: Spoken Natural Language Dialogue Systems: A Practical Approach
David R. Traum
Computational Linguistics, Volume 22, Number 3, September 1996
1994
pdf
bib
Discourse Obligations in Dialogue Processing
David R. Traum
|
James F. Allen
32nd Annual Meeting of the Association for Computational Linguistics
1993
pdf
Rhetorical Relations, Action and Intentionality in Conversation
David Traum
Intentionality and Structure in Discourse Relations