2021
Athena 2.0: Contextualized Dialogue Management for an Alexa Prize SocialBot
Juraj Juraska | Kevin Bowden | Lena Reed | Vrindavan Harrison | Wen Cui | Omkar Patil | Rishi Rajasekaran | Angela Ramirez | Cecilia Li | Eduardo Zamora | Phillip Lee | Jeshwanth Bheemanpally | Rohan Pandey | Adwait Ratnaparkhi | Marilyn Walker
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Athena 2.0 is an Alexa Prize SocialBot that has been a finalist in the last two Alexa Prize Grand Challenges. One reason for Athena’s success is its novel dialogue management strategy, which allows it to dynamically construct dialogues and responses from component modules, leading to novel conversations with every interaction. Here we describe Athena’s system design and performance in the Alexa Prize during the 2020/21 competition. A live demo of Athena, together with video recordings, is intended to provoke discussion on the state of the art in conversational AI.
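The abstract's claim about dynamically constructing responses from component modules can be pictured as a simple arbitration loop: each module proposes a candidate for the current dialogue state and a manager selects among them. The sketch below is our own minimal illustration, not Athena's actual architecture; every name in it is hypothetical.

    from dataclasses import dataclass
    from typing import Callable, List, Optional

    @dataclass
    class Candidate:
        text: str
        score: float   # the module's own confidence in its candidate
        module: str

    # Each component module maps the current dialogue state to an optional candidate.
    ResponseModule = Callable[[dict], Optional[Candidate]]

    def construct_response(state: dict, modules: List[ResponseModule]) -> str:
        """Collect candidates from every module and return the best-scoring one."""
        candidates = [c for m in modules if (c := m(state)) is not None]
        if not candidates:
            return "Tell me more about that."  # generic fallback prompt
        return max(candidates, key=lambda c: c.score).text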
2020
Learning from Mistakes: Combining Ontologies via Self-Training for Dialogue Generation
Lena Reed | Vrindavan Harrison | Shereen Oraby | Dilek Hakkani-Tur | Marilyn Walker
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue
Natural language generators (NLGs) for task-oriented dialogue typically take a meaning representation (MR) as input, and are trained end-to-end with a corpus of MR/utterance pairs, where the MRs cover a specific set of dialogue acts and domain attributes. Creation of such datasets is labor-intensive and time-consuming. Therefore, dialogue systems for new domain ontologies would benefit from using data for pre-existing ontologies. Here we explore, for the first time, whether it is possible to train an NLG for a new, larger ontology using existing training sets for the restaurant domain, where each set is based on a different ontology. We create a new, larger combined ontology, and then train an NLG to produce utterances covering it. For example, if one dataset has attributes for family friendly and rating information, and the other has attributes for decor and service, our aim is an NLG for the combined ontology that can produce utterances that realize values for family friendly, rating, decor and service. Initial experiments with a baseline neural sequence-to-sequence model show that this task is surprisingly challenging. We then develop a novel self-training method that identifies (errorful) model outputs, automatically constructs a corrected MR input to form a new (MR, utterance) training pair, and then repeatedly adds these new instances back into the training data. We then test the resulting model on a new test set. The result is a self-trained model whose performance is an absolute 75.4% improvement over the baseline model. We also report a human qualitative evaluation of the final model showing that it achieves high naturalness, semantic coherence and grammaticality.
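The self-training method described above lends itself to a compact sketch. The Python below is our reconstruction under stated assumptions, not the authors' code: train, generate, and parse_attributes are hypothetical stand-ins for the paper's components, passed in as functions.

    def self_train(model, train_pairs, unlabeled_mrs,
                   train, generate, parse_attributes, rounds=3):
        """Repeatedly relabel the model's own (possibly errorful) outputs
        and feed the corrected (MR, utterance) pairs back into training."""
        for _ in range(rounds):
            model = train(model, train_pairs)
            new_pairs = []
            for mr in unlabeled_mrs:
                utterance = generate(model, mr)
                realized = parse_attributes(utterance)  # attributes the text actually expresses
                if realized:
                    # Construct a corrected MR matching the utterance, turning
                    # an errorful output into a valid new training pair.
                    new_pairs.append((realized, utterance))
            train_pairs = train_pairs + new_pairs
        return model

The key move is that an output which is wrong for its original MR is still right for the corrected MR, so even errorful generations become usable training data.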
2019
Maximizing Stylistic Control and Semantic Accuracy in NLG: Personality Variation and Discourse Contrast
Vrindavan Harrison | Lena Reed | Shereen Oraby | Marilyn Walker
Proceedings of the 1st Workshop on Discourse Structure in Neural NLG
Neural generation methods for task-oriented dialogue typically generate from a meaning representation that is populated using a database of domain information, such as a table of data describing a restaurant. While earlier work focused solely on the semantic fidelity of outputs, recent work has started to explore methods for controlling the style of the generated text while simultaneously achieving semantic accuracy. Here we experiment with two stylistic benchmark tasks: generating language that exhibits variation in personality, and generating discourse contrast. We report a large performance improvement in both stylistic control and semantic accuracy over the state of the art on both of these benchmarks. We test several different models and show that putting stylistic conditioning in the decoder and eliminating the semantic re-ranker used in earlier models results in more than 15 points higher BLEU for Personality, with a reduction of semantic error to near zero. We also report an improvement from .75 to .81 in controlling contrast and a reduction in semantic error from 16% to 2%.
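As a rough illustration of "stylistic conditioning in the decoder", the PyTorch sketch below (our reconstruction, not the authors' model) concatenates a style vector to the decoder input at every time step, so style steers generation directly rather than through a post-hoc semantic re-ranker.

    import torch
    import torch.nn as nn

    class StyleConditionedDecoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, style_dim=16, hidden_dim=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Decoder input = token embedding + style vector, at every step.
            self.rnn = nn.GRU(emb_dim + style_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, style_vec, hidden=None):
            # tokens: (batch, seq); style_vec: (batch, style_dim)
            emb = self.embed(tokens)
            style = style_vec.unsqueeze(1).expand(-1, emb.size(1), -1)
            output, hidden = self.rnn(torch.cat([emb, style], dim=-1), hidden)
            return self.out(output), hidden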
2018
Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators
Shereen Oraby | Lena Reed | Shubhangi Tandon | Sharath T.S. | Stephanie Lukin | Marilyn Walker
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
Natural language generators for task-oriented dialogue must effectively realize system dialogue actions and their associated semantics. In many applications, it is also desirable for generators to control the style of an utterance. To date, work on task-oriented neural generation has primarily focused on semantic fidelity rather than achieving stylistic goals, while work on style has been done in contexts where it is difficult to measure content preservation. Here we present three different sequence-to-sequence models and carefully test how well they disentangle content and style. We use a statistical generator, Personage, to synthesize a new corpus of over 88,000 restaurant domain utterances whose style varies according to models of personality, giving us total control over both the semantic content and the stylistic variation in the training data. We then vary the amount of explicit stylistic supervision given to the three models. We show that our most explicit model can simultaneously achieve high fidelity to both semantic and stylistic goals: this model adds a context vector of 36 stylistic parameters as input to the hidden state of the encoder at each time step, showing the benefits of explicit stylistic supervision, even when the amount of training data is large.
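The "most explicit model" above conditions the encoder on a fixed-length vector of stylistic parameters at each time step. The sketch below is our hypothetical rendering of that setup; the parameter names are illustrative stand-ins in the spirit of Personage's pragmatic markers, not the paper's actual 36.

    import torch
    import torch.nn as nn

    STYLE_PARAMS = ["exclamation", "tag_question", "in_group_marker",
                    "stuttering", "hedge", "softener"]  # illustrative; the paper uses 36

    def style_vector(active: dict) -> torch.Tensor:
        """Fixed-length vector of stylistic parameter values for one utterance."""
        return torch.tensor([float(active.get(p, 0.0)) for p in STYLE_PARAMS])

    class StyleAwareEncoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Encoder input = token embedding + style context vector, at every step.
            self.rnn = nn.GRU(emb_dim + len(STYLE_PARAMS), hidden_dim, batch_first=True)

        def forward(self, tokens, style_vec):
            emb = self.embed(tokens)  # (batch, seq, emb_dim)
            ctx = style_vec.view(1, 1, -1).expand(emb.size(0), emb.size(1), -1)
            return self.rnn(torch.cat([emb, ctx], dim=-1))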
Can Neural Generators for Dialogue Learn Sentence Planning and Discourse Structuring?
Lena Reed | Shereen Oraby | Marilyn Walker
Proceedings of the 11th International Conference on Natural Language Generation
Responses in task-oriented dialogue systems often realize multiple propositions whose ultimate form depends on the use of sentence planning and discourse structuring operations. For example, a recommendation may consist of an explicitly evaluative utterance, e.g. Chanpen Thai is the best option, along with content related by the justification discourse relation, e.g. It has great food and service, that combines multiple propositions into a single phrase. While neural generation methods integrate sentence planning and surface realization in one end-to-end learning framework, previous work has not shown that neural generators can: (1) perform common sentence planning and discourse structuring operations; (2) make decisions as to whether to realize content in a single sentence or over multiple sentences; (3) generalize sentence planning and discourse relation operations beyond what was seen in training. We systematically create large training corpora that exhibit particular sentence planning operations and then test neural models to see what they learn. We compare models without explicit latent variables for sentence planning with ones that provide explicit supervision during training. We show that only the models with additional supervision can reproduce sentence planning and discourse operations and generalize to situations unseen in training.
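What "explicit supervision" for sentence planning can look like on the input side is easy to make concrete: augment the flat MR with tokens that name the discourse relation and the sentence grouping, so the generator is told how to structure its output rather than left to induce it. The encoding below is our own illustration; the token names are not the paper's.

    def linearize_mr(name, attrs, relation="JUSTIFY", grouping=((0,), (1, 2))):
        """Flatten an MR plus sentence-planning supervision into one input string."""
        slots = [f"{k}[{v}]" for k, v in attrs.items()]
        plan = " ".join(
            "SENT( " + " ".join(slots[i] for i in group) + " )" for group in grouping
        )
        return f"name[{name}] REL={relation} {plan}"

    print(linearize_mr("Chanpen Thai",
                       {"recommend": "yes", "food": "great", "service": "great"}))
    # name[Chanpen Thai] REL=JUSTIFY SENT( recommend[yes] ) SENT( food[great] service[great] )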
2017
Learning Lexico-Functional Patterns for First-Person Affect
Lena Reed | Jiaqi Wu | Shereen Oraby | Pranav Anand | Marilyn Walker
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Informal first-person narratives are a unique resource for computational models of everyday events and people’s affective reactions to them. People blogging about their day tend not to explicitly say I am happy. Instead, they describe situations from which other humans can readily infer their affective reactions. However, current sentiment dictionaries are missing much of the information needed to make similar inferences. We build on recent work that models affect in terms of lexical predicate functions and affect on the predicate’s arguments. We present a method to learn proxies for these functions from first-person narratives. We construct a novel fine-grained test set, and show that the patterns we learn improve our ability to predict first-person affective reactions to everyday events, from .67 F (a Stanford sentiment baseline) to .75 F.
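The predicate-function idea is the technical core here: a predicate maps the affect of its argument to the affect of the whole event, which is why "I lost my wallet" reads as negative with no sentiment word present. The toy sketch below is our illustration; its tiny lexicons are hypothetical stand-ins for the patterns the method learns from narratives.

    ARG_AFFECT = {"wallet": +1, "job": +1, "cold": -1, "headache": -1}

    # Each predicate is a function from argument affect to event affect.
    PREDICATE_FN = {
        "lost": lambda a: -a,     # losing something good is bad (and vice versa)
        "got": lambda a: a,       # getting something inherits its affect
        "avoided": lambda a: -a,  # avoiding something bad is good
    }

    def event_affect(predicate: str, argument: str) -> int:
        fn = PREDICATE_FN.get(predicate, lambda a: a)
        return fn(ARG_AFFECT.get(argument, 0))

    print(event_affect("lost", "wallet"))   # -1: negative affective reaction
    print(event_affect("avoided", "cold"))  # +1: positive affective reaction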
2016
Creating and Characterizing a Diverse Corpus of Sarcasm in Dialogue
Shereen Oraby | Vrindavan Harrison | Lena Reed | Ernesto Hernandez | Ellen Riloff | Marilyn Walker
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue
2015
And That’s A Fact: Distinguishing Factual and Emotional Argumentation in Online Dialogue
Shereen Oraby | Lena Reed | Ryan Compton | Ellen Riloff | Marilyn Walker | Steve Whittaker
Proceedings of the 2nd Workshop on Argumentation Mining
Generating Sentence Planning Variations for Story Telling
Stephanie Lukin | Lena Reed | Marilyn Walker
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue