Junseok Park


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2021

pdf bib
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
Gi-Cheon Kang | Junseok Park | Hwaran Lee | Byoung-Tak Zhang | Jin-Hwa Kim
Findings of the Association for Computational Linguistics: EMNLP 2021

Visual dialog is a task of answering a sequence of questions grounded in an image using the previous dialog history as context. In this paper, we study how to address two fundamental challenges for this task: (1) reasoning over underlying semantic structures among dialog rounds and (2) identifying several appropriate answers to the given question. To address these challenges, we propose a Sparse Graph Learning (SGL) method to formulate visual dialog as a graph structure learning task. SGL infers inherently sparse dialog structures by incorporating binary and score edges and leveraging a new structural loss function. Next, we introduce a Knowledge Transfer (KT) method that extracts the answer predictions from the teacher model and uses them as pseudo labels. We propose KT to remedy the shortcomings of single ground-truth labels, which severely limit the ability of a model to obtain multiple reasonable answers. As a result, our proposed model significantly improves reasoning capability compared to baseline methods and outperforms the state-of-the-art approaches on the VisDial v1.0 dataset. The source code is available at https://github.com/gicheonkang/SGLKT-VisDial.