Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes

Kaige Xie; Sarah Wiegreffe; Mark Riedl

doi:10.18653/v1/2022.findings-emnlp.209

Abstract

Multi-hop Question Answering (QA) is a challenging task because it requires accurately aggregating information from multiple context paragraphs and thoroughly understanding the underlying reasoning chains. Recent work in multi-hop QA has shown that performance can be boosted by first decomposing the questions into simpler, single-hop questions. In this paper, we explore one additional utility of multi-hop decomposition from the perspective of explainable NLP: creating explanations by probing a neural QA model with the decomposed questions. We hypothesize that in doing so, users will be better able to predict when the underlying QA system will give the correct answer. Through human participant studies, we verify that exposing the decomposition probes and their answers to users can increase their ability to predict system performance on a per-question basis. We show that decomposition is an effective form of probing QA systems as well as a promising approach to explanation generation. In-depth analyses show the need for improvements in decomposition systems.

Anthology ID: 2022.findings-emnlp.209
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2888–2902
URL: https://aclanthology.org/2022.findings-emnlp.209
DOI: 10.18653/v1/2022.findings-emnlp.209
Cite (ACL): Kaige Xie, Sarah Wiegreffe, and Mark Riedl. 2022. Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 2888–2902, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes (Xie et al., Findings 2022)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-emnlp.209.pdf
Video: https://preview.aclanthology.org/dois-2013-emnlp/2022.findings-emnlp.209.mp4
