To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies

Dirk Väth, Ngoc Thang Vu


Abstract
In this paper, we explore state-of-the-art deep reinforcement learning methods for dialog policy training, such as prioritized experience replay, double deep Q-networks, dueling network architectures, and distributional learning. Our main findings show that each individual method improves reward and task success rate, but that combining these methods into a Rainbow agent, which performs best across tasks and environments, is non-trivial. We therefore provide insights into how each method influences the combination and how to combine them into a Rainbow agent.
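To make two of the surveyed components concrete, here is a minimal sketch, not the authors' implementation, assuming a PyTorch setup with illustrative layer sizes and a hypothetical discount factor: a dueling Q-network head, and the double-DQN target that decouples greedy action selection (online network) from value estimation (target network).

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling architecture: separate value and advantage streams."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.body(state)
        v, a = self.value(h), self.advantage(h)
        # Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
        return v + a - a.mean(dim=1, keepdim=True)

def double_dqn_target(online: DuelingQNet, target: DuelingQNet,
                      reward: torch.Tensor, next_state: torch.Tensor,
                      done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Double DQN: the online net picks the greedy next action,
    the target net evaluates it, reducing Q-value overestimation."""
    with torch.no_grad():
        greedy = online(next_state).argmax(dim=1, keepdim=True)
        next_q = target(next_state).gather(1, greedy).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q
```

Prioritized experience replay and distributional learning would further change how transitions are sampled and how returns are represented; per the abstract, it is this combination of all components that turns out to be non-trivial.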
Anthology ID: W19-5908
Volume: Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
Month: September
Year: 2019
Address: Stockholm, Sweden
Venue: SIGDIAL
SIG: SIGDIAL
Publisher: Association for Computational Linguistics
Pages: 62–67
URL: https://aclanthology.org/W19-5908
DOI: 10.18653/v1/W19-5908
Cite (ACL): Dirk Väth and Ngoc Thang Vu. 2019. To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies. In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, pages 62–67, Stockholm, Sweden. Association for Computational Linguistics.
Cite (Informal): To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies (Väth & Vu, SIGDIAL 2019)
PDF: https://aclanthology.org/W19-5908.pdf