Yan Cao


2020

Adaptive Dialog Policy Learning with Hindsight and User Modeling
Yan Cao | Keting Lu | Xiaoping Chen | Shiqi Zhang
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Reinforcement learning (RL) methods have been widely used for learning dialog policies. Sample efficiency, i.e., the efficiency of learning from limited dialog experience, is particularly important in RL-based dialog policy learning, because interacting with people is costly and low-quality dialog policies produce a very poor user experience. In this paper, we develop LHUA (Learning with Hindsight, User modeling, and Adaptation), which, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation and hindsight provide the dialog agent with more experience and more (positive) reinforcement, respectively. Experimental results suggest that LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts.

2018

Analyzing Vocabulary Commonality Index Using Large-scaled Database of Child Language Development
Yan Cao | Yasuhiro Minami | Yuko Okumura | Tessei Kobayashi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)