Kilian Weinberger
2022
Long-term Control for Dialogue Generation: Methods and Evaluation
Ramya Ramakrishnan | Hashan Narangodage | Mauro Schilman | Kilian Weinberger | Ryan McDonald
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Current approaches for controlling dialogue response generation are primarily focused on high-level attributes like style, sentiment, or topic. In this work, we focus on constrained long-term dialogue generation, which involves more fine-grained control and requires a given set of control words to appear in generated responses. This setting requires a model to not only consider the generation of these control words in the immediate context, but also produce utterances that will encourage the generation of the words at some time in the (possibly distant) future. We define the problem of constrained long-term control for dialogue generation, identify gaps in current methods for evaluation, and propose new metrics that better measure long-term control. We also propose a retrieval-augmented method that improves performance of long-term controlled generation via logit modification techniques. We show through experiments on three task-oriented dialogue datasets that our metrics better assess dialogue control relative to current alternatives and that our method outperforms state-of-the-art constrained generation baselines.
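The abstract mentions steering generation via logit modification. The sketch below is a minimal, illustrative example of that general idea (not the paper's method): during decoding, the logits of tokens belonging to the control words are boosted by a fixed bonus. The function name, bonus value, and toy vocabulary are assumptions for illustration only.

```python
# Illustrative sketch of logit modification for constrained decoding.
# Not the paper's implementation; the bonus value and token ids are toy assumptions.
import torch

def boost_control_logits(logits: torch.Tensor,
                         control_token_ids: list,
                         bonus: float = 2.0) -> torch.Tensor:
    """Add a fixed bonus to the logits of control-word tokens.

    logits: (batch, vocab) next-token scores from a dialogue model.
    control_token_ids: vocabulary ids of the words that should eventually appear.
    bonus: hypothetical constraint strength (assumed, not from the paper).
    """
    boosted = logits.clone()
    boosted[:, control_token_ids] += bonus
    return boosted

# Toy usage with random logits over a 10-token vocabulary.
logits = torch.randn(1, 10)
modified = boost_control_logits(logits, control_token_ids=[3, 7])
next_token = torch.argmax(modified, dim=-1)
```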
2018
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Xilun Chen | Yu Sun | Ben Athiwaratkun | Claire Cardie | Kilian Weinberger
Transactions of the Association for Computational Linguistics, Volume 6
In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exist. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.
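As a rough illustration of the architecture described above (a shared feature extractor feeding a sentiment classifier and an adversarial language discriminator), here is a minimal PyTorch sketch. The layer sizes, the averaging encoder, and the gradient-reversal trick for the adversarial branch are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a shared extractor with two branches, in the spirit of ADAN.
# Sizes and the gradient-reversal mechanism are assumptions for illustration.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in backward,
    so the extractor is trained adversarially against the language discriminator."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class ADANSketch(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, emb_dim)  # averages word embeddings (deep averaging)
        self.feature = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())  # shared feature extractor
        self.sentiment = nn.Linear(hidden, 2)   # sentiment classifier branch
        self.language = nn.Linear(hidden, 2)    # adversarial language discriminator branch

    def forward(self, token_ids):
        h = self.feature(self.embed(token_ids))
        return self.sentiment(h), self.language(GradReverse.apply(h))

# Toy batch: two "documents" of four token ids each.
model = ADANSketch()
sent_logits, lang_logits = model(torch.randint(0, 10000, (2, 4)))
```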
Co-authors
- Ramya Ramakrishnan 1
- Hashan Narangodage 1
- Mauro Schilman 1
- Ryan McDonald 1
- Xilun Chen 1