Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models

Sanghwan Bae; Donghyun Kwak; Sungdong Kim; Donghoon Ham; Soyoung Kang; Sang-Woo Lee; Woomyoung Park

doi:10.18653/v1/2022.naacl-main.155

Abstract

Recent open-domain dialogue models have brought numerous breakthroughs. However, building a chat system is not scalable since it often requires a considerable volume of human-human dialogue data, especially when enforcing features such as persona, style, or safety. In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans. To accomplish this, the system must satisfy a role specification that includes certain conditions on the stated features as well as a system policy on whether or not certain types of utterances are allowed. For this, we propose an efficient data collection framework leveraging in-context few-shot learning of large-scale language models for building a role-satisfying dialogue dataset from scratch. We then compare various architectures for open-domain dialogue systems in terms of meeting role specifications while maintaining conversational abilities. Automatic and human evaluations show that our models return few out-of-bounds utterances while maintaining competitive performance on general metrics. We release a Korean dialogue dataset we built for further research.

Anthology ID: 2022.naacl-main.155
Volume: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 2128–2150
URL: https://aclanthology.org/2022.naacl-main.155
DOI: 10.18653/v1/2022.naacl-main.155
Cite (ACL): Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, and Woomyoung Park. 2022. Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2128–2150, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models (Bae et al., NAACL 2022)
PDF: https://preview.aclanthology.org/naacl-24-ws-corrections/2022.naacl-main.155.pdf
Video: https://preview.aclanthology.org/naacl-24-ws-corrections/2022.naacl-main.155.mp4
Code: naver-ai/carecall-corpus
Data: CareCall
