DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension

Kai Sun; Dian Yu; Jianshu Chen; Dong Yu; Yejin Choi; Claire Cardie

doi:10.1162/tacl_a_00264

Abstract

We present DREAM, the first dialogue-based multiple-choice reading comprehension data set. Collected from English as a Foreign Language examinations designed by human experts to evaluate the comprehension level of Chinese learners of English, our data set contains 10,197 multiple-choice questions for 6,444 dialogues. In contrast to existing reading comprehension data sets, DREAM is the first to focus on in-depth multi-turn multi-party dialogue understanding. DREAM is likely to present significant challenges for existing reading comprehension systems: 84% of answers are non-extractive, 85% of questions require reasoning beyond a single sentence, and 34% of questions also involve commonsense knowledge. We apply several popular neural reading comprehension models that primarily exploit surface information within the text and find them to, at best, just barely outperform a rule-based approach. We next investigate the effects of incorporating dialogue structure and different kinds of general world knowledge into both rule-based and (neural and non-neural) machine learning-based reading comprehension models. Experimental results on the DREAM data set show the effectiveness of dialogue structure and general world knowledge. DREAM is available at https://dataset.org/dream/.

Anthology ID: Q19-1014
Volume: Transactions of the Association for Computational Linguistics, Volume 7
Year: 2019
Address: Cambridge, MA
Editors: Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
Venue: TACL
Publisher: MIT Press
Pages: 217–231
URL: https://aclanthology.org/Q19-1014
DOI: 10.1162/tacl_a_00264
Cite (ACL): Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, and Claire Cardie. 2019. DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension. Transactions of the Association for Computational Linguistics, 7:217–231.
Cite (Informal): DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension (Sun et al., TACL 2019)
PDF: https://preview.aclanthology.org/ml4al-ingestion/Q19-1014.pdf
Data: DREAM, CoQA, ConceptNet, NarrativeQA
