Actor-Double-Critic: Incorporating Model-Based Critic for Task-Oriented Dialogue Systems

Yen-Chen Wu; Bo-Hsiang Tseng; Milica Gasic

doi:10.18653/v1/2020.findings-emnlp.75

Abstract

To improve the sample efficiency of deep reinforcement learning (DRL), we implemented the imagination-augmented agent (I2A) in spoken dialogue systems (SDS). Although I2A achieves a higher success rate than baselines by augmenting a policy network with predicted futures, its complicated architecture introduces unwanted instability. In this work, we propose actor-double-critic (ADC) to improve the stability and overall performance of I2A. ADC simplifies the architecture of I2A to reduce excessive parameters and hyper-parameters. More importantly, a separate model-based critic shares parameters between actions and makes back-propagation explicit. In our experiments on the Cambridge Restaurant Booking task, ADC enhances success rates considerably and shows robustness to imperfect environment models. In addition, ADC demonstrates stability and sample efficiency: it significantly reduces the standard deviation of success rates relative to the baseline and reaches an 80% success rate with half the training data.

Anthology ID: 2020.findings-emnlp.75
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 854–863
URL: https://preview.aclanthology.org/fix-sig-urls/2020.findings-emnlp.75/
DOI: 10.18653/v1/2020.findings-emnlp.75
Cite (ACL): Yen-chen Wu, Bo-Hsiang Tseng, and Milica Gasic. 2020. Actor-Double-Critic: Incorporating Model-Based Critic for Task-Oriented Dialogue Systems. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 854–863, Online. Association for Computational Linguistics.
Cite (Informal): Actor-Double-Critic: Incorporating Model-Based Critic for Task-Oriented Dialogue Systems (Wu et al., Findings 2020)
PDF: https://preview.aclanthology.org/fix-sig-urls/2020.findings-emnlp.75.pdf
