Evaluating Theory of Mind in Question Answering

Aida Nematzadeh, Kaylee Burns, Erin Grant, Alison Gopnik, Tom Griffiths


Abstract
We propose a new dataset for evaluating question answering models with respect to their capacity to reason about beliefs. Our tasks are inspired by theory-of-mind experiments that examine whether children are able to reason about the beliefs of others, in particular when those beliefs differ from reality. We evaluate a number of recent neural models with memory augmentation. We find that all fail on our tasks, which require keeping track of inconsistent states of the world; moreover, the models' accuracy decreases notably when random sentences are introduced into the tasks at test time.
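To make the task concrete, below is a minimal sketch of a first-order false-belief episode in the bAbI-style numbered-line format the released dataset follows (see kayburns/tom-qa-dataset). The story wording, entity names, and the `last_location` helper are illustrative assumptions for this sketch, not copied from the dataset; the helper simulates the failure mode the paper probes, namely a reader that tracks only the true state of the world and therefore answers belief questions incorrectly.

```python
# Illustrative first-order false-belief story (Sally-Anne style),
# written in bAbI-like numbered lines with a tab-separated
# question / gold answer / supporting-fact index at the end.
story = [
    "1 Sally entered the kitchen.",
    "2 Anne entered the kitchen.",
    "3 The apple is in the basket.",
    "4 Sally exited the kitchen.",         # Sally no longer observes the room
    "5 Anne moved the apple to the box.",  # reality changes; Sally's belief does not
    "6 Where will Sally look for the apple?\tbasket\t3",
]

def last_location(story, obj):
    """A naive 'reality-only' reader: return the object's latest location.

    A model that tracks only the true world state answers 'box' here,
    but the correct answer to the belief question is 'basket'.
    """
    location = None
    for line in story:
        if f"{obj} is in the" in line:
            location = line.rstrip(".").split("is in the")[-1].strip()
        elif f"moved the {obj} to the" in line:
            location = line.rstrip(".").split("to the")[-1].strip()
    return location

question, gold, support = story[-1].split("\t")
print("question:", question.split(" ", 1)[1])
print("gold answer:", gold)                               # basket
print("reality-only answer:", last_location(story, "apple"))  # box (wrong)
```

Answering correctly requires maintaining two inconsistent states in parallel: the actual location of the object and the (now stale) belief of the agent who left the room, which is exactly what the evaluated memory-augmented models fail to do.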
Anthology ID: D18-1261
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 2392–2400
URL: https://aclanthology.org/D18-1261
DOI: 10.18653/v1/D18-1261
Cite (ACL): Aida Nematzadeh, Kaylee Burns, Erin Grant, Alison Gopnik, and Tom Griffiths. 2018. Evaluating Theory of Mind in Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2392–2400, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Evaluating Theory of Mind in Question Answering (Nematzadeh et al., EMNLP 2018)
PDF: https://preview.aclanthology.org/naacl24-info/D18-1261.pdf
Video: https://preview.aclanthology.org/naacl24-info/D18-1261.mp4
Code: kayburns/tom-qa-dataset + additional community code
Data: ToM QA