Hyunkyung Bae


2023

pdf
Injecting Comparison Skills in Task-Oriented Dialogue Systems for Database Search Results Disambiguation
Yongil Kim | Yerin Hwang | Joongbo Shin | Hyunkyung Bae | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2023

In task-oriented dialogue (TOD) systems designed to aid users accomplish specific goals in one or more domains, the agent retrieves entities that satisfy user constraints from the database. However, when multiple database search results exist, an ambiguity occurs regarding which results to select and present to the user. Existing TOD systems handle this ambiguity by randomly selecting one or few results and presenting their names to the user. However, in a real scenario, users do not always accept a randomly recommended entity, and users should have access to more comprehensive information about the search results. To address this limitation, we propose a novel task called Comparison-Based database search Ambiguity handling (CBA), which handles ambiguity in database search results by comparing the properties of multiple entities to enable users to choose according to their preferences. Accordingly, we introduce a new framework for automatically collecting high-quality dialogue data along with the Disambiguating Schema-guided Dialogue (DSD) dataset, an augmented version of the SGD dataset. Experimental studies on the DSD dataset demonstrate that training baseline models with the dataset effectively address the CBA task. Our dataset and code will be publicized.

2022

pdf
Improving Multiple Documents Grounded Goal-Oriented Dialog Systems via Diverse Knowledge Enhanced Pretrained Language Model
Yunah Jang | Dongryeol Lee | Hyung Joo Park | Taegwan Kang | Hwanhee Lee | Hyunkyung Bae | Kyomin Jung
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering

In this paper, we mainly discuss about our submission to MultiDoc2Dial task, which aims to model the goal-oriented dialogues grounded in multiple documents. The proposed task is split into grounding span prediction and agent response generation. The baseline for the task is the retrieval augmented generation model, which consists of a dense passage retrieval model for the retrieval part and the BART model for the generation part. The main challenge of this task is that the system requires a great amount of pre-trained knowledge to generate answers grounded in multiple documents. To overcome this challenge, we adopt model pretraining, fine-tuning, and multi-task learning to enhance our model’s coverage of pretrained knowledge. We experimented with various settings of our method to show the effectiveness of our approaches.