Counterfactual Matters: Intrinsic Probing For Dialogue State Tracking

Yi Huang; Junlan Feng; Xiaoting Wu; Xiaoyu Du

doi:10.18653/v1/2021.eancs-1.1

Abstract

A Dialogue State Tracker (DST) is a core component of modular task-oriented dialogue systems. Tremendous research progress has been made in the past ten years to improve the performance of DSTs, especially on benchmark datasets. However, their generalization to novel and realistic scenarios beyond the held-out conversations is limited. In this paper, we design experimental studies to answer two questions: 1) How does the distribution of dialogue data affect the performance of DSTs? 2) What are effective ways to probe counterfactual matter for DSTs? Our findings are that the performance variance of generative DSTs is due not only to the model structure itself but also to the distribution of cross-domain values. Evaluating iconic generative DST models on the MultiWOZ dataset with counterfactuals results in a significant performance drop of up to 34.64% (from 50.91% to 16.27%) in absolute joint goal accuracy. We believe that our experimental results can guide future work to better understand the intrinsic core of DST and to rethink the suitable approach for specific tasks given the properties of the application.

Anthology ID: 2021.eancs-1.1
Volume: The First Workshop on Evaluations and Assessments of Neural Conversation Systems
Month: November
Year: 2021
Address: Online
Editors: Wei Wei, Bo Dai, Tuo Zhao, Lihong Li, Diyi Yang, Yun-Nung Chen, Y-Lan Boureau, Asli Celikyilmaz, Alborz Geramifard, Aman Ahuja, Haoming Jiang
Venue: EANCS
Publisher: Association for Computational Linguistics
Pages: 1–6
URL: https://aclanthology.org/2021.eancs-1.1
DOI: 10.18653/v1/2021.eancs-1.1
Cite (ACL): Yi Huang, Junlan Feng, Xiaoting Wu, and Xiaoyu Du. 2021. Counterfactual Matters: Intrinsic Probing For Dialogue State Tracking. In The First Workshop on Evaluations and Assessments of Neural Conversation Systems, pages 1–6, Online. Association for Computational Linguistics.
Cite (Informal): Counterfactual Matters: Intrinsic Probing For Dialogue State Tracking (Huang et al., EANCS 2021)
PDF: https://preview.aclanthology.org/emnlp22-frontmatter/2021.eancs-1.1.pdf
