Abstract
Reading comprehension QA tasks have seen a recent surge in popularity, yet most works have focused on fact-finding extractive QA. We instead focus on a more challenging multi-hop generative task (NarrativeQA), which requires the model to reason, gather, and synthesize disjoint pieces of information within the context to generate an answer. This type of multi-step reasoning also often requires understanding implicit relations, which humans resolve via external, background commonsense knowledge. We first present a strong generative baseline that uses a multi-attention mechanism to perform multiple hops of reasoning and a pointer-generator decoder to synthesize the answer. This model performs substantially better than previous generative models, and is competitive with current state-of-the-art span prediction models. We next introduce a novel system for selecting grounded multi-hop relational commonsense information from ConceptNet via a pointwise mutual information and term-frequency based scoring function. Finally, we effectively use this extracted commonsense information to fill in gaps of reasoning between context hops, using a selectively-gated attention mechanism. This boosts the model’s performance significantly (also verified via human evaluation), establishing a new state-of-the-art for the task. We also show that our background knowledge enhancements are generalizable and improve performance on QAngaroo-WikiHop, another multi-hop reasoning dataset.
- Anthology ID:
- D18-1454
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4220–4230
- URL:
- https://aclanthology.org/D18-1454
- DOI:
- 10.18653/v1/D18-1454
- Cite (ACL):
- Lisa Bauer, Yicheng Wang, and Mohit Bansal. 2018. Commonsense for Generative Multi-Hop Question Answering Tasks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4220–4230, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Commonsense for Generative Multi-Hop Question Answering Tasks (Bauer et al., EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/D18-1454.pdf
- Code:
- yicheng-w/CommonSenseMultiHopQA
- Data:
- NarrativeQA, SQuAD, WikiHop