@inproceedings{wan-etal-2022-unified,
  title     = {A Unified Dialogue User Simulator for Few-shot Data Augmentation},
  author    = {Wan, Dazhen and
               Zhang, Zheng and
               Zhu, Qi and
               Liao, Lizi and
               Huang, Minlie},
  editor    = {Goldberg, Yoav and
               Kozareva, Zornitsa and
               Zhang, Yue},
  booktitle = {Findings of the Association for Computational Linguistics: {EMNLP} 2022},
  month     = dec,
  year      = {2022},
  address   = {Abu Dhabi, United Arab Emirates},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2022.findings-emnlp.277/},
  doi       = {10.18653/v1/2022.findings-emnlp.277},
  pages     = {3788--3799},
  abstract  = {Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are on limited scales, which cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they heavily rely on annotated data in the target domain, which require a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. The experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation.},
}
Markdown (Informal)
[A Unified Dialogue User Simulator for Few-shot Data Augmentation](https://aclanthology.org/2022.findings-emnlp.277/) (Wan et al., Findings 2022)
ACL