Question Answering with Long Multiple-Span Answers

Ming Zhu, Aman Ahuja, Da-Cheng Juan, Wei Wei, Chandan K. Reddy


Abstract
Answering questions in many real-world applications often requires complex and precise information excerpted from text spread across a long document. However, no such annotated dataset is currently publicly available, which hinders the development of neural question-answering (QA) systems. To this end, we present MASH-QA, a Multiple Answer Spans Healthcare Question Answering dataset from the consumer health domain, where answers may need to be excerpted from multiple, non-consecutive parts of text spread across a long document. We also propose MultiCo, a neural architecture that is able to capture the relevance among multiple answer spans, by using a query-based contextualized sentence selection approach, for forming the answer to the given question. We also demonstrate that conventional QA models are not suitable for this type of task and perform poorly in this setting. Extensive experiments are conducted, and the experimental results confirm that the proposed model significantly outperforms the state-of-the-art QA models in this multi-span QA setting.
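The multi-span setting described above can be framed as query-conditioned sentence selection rather than single-span extraction. The following is a minimal, hypothetical PyTorch sketch of that idea; it is not the authors' MultiCo implementation, and all module names, encoders, and dimensions are illustrative assumptions (a pretrained transformer encoder would typically replace the GRUs).

```python
# Illustrative sketch only (not the MultiCo architecture from the paper):
# score every sentence of a long document against the question, and take
# all sentences above a threshold as the (possibly non-consecutive) answer.
import torch
import torch.nn as nn

class SentenceSelector(nn.Module):
    def __init__(self, vocab_size=30000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # Encodes each sentence independently into a fixed vector.
        self.sent_enc = nn.GRU(dim, dim, batch_first=True)
        # Contextualizes sentence vectors across the whole document.
        self.ctx_enc = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        # Encodes the question into a single query vector.
        self.query_enc = nn.GRU(dim, dim, batch_first=True)
        # Scores each contextualized sentence against the query.
        self.scorer = nn.Linear(3 * dim, 1)

    def forward(self, question_ids, sentence_ids):
        # question_ids: (q_len,)   sentence_ids: (num_sents, s_len)
        _, q = self.query_enc(self.embed(question_ids).unsqueeze(0))   # (1, 1, dim)
        _, s = self.sent_enc(self.embed(sentence_ids))                 # (1, num_sents, dim)
        ctx, _ = self.ctx_enc(s)                                       # (1, num_sents, 2*dim)
        q_rep = q.squeeze(0).expand(ctx.size(1), -1)                   # (num_sents, dim)
        feats = torch.cat([ctx.squeeze(0), q_rep], dim=-1)             # (num_sents, 3*dim)
        return torch.sigmoid(self.scorer(feats)).squeeze(-1)           # per-sentence answer probability

# Usage with dummy token ids: sentences scoring above 0.5 form the multi-span answer.
model = SentenceSelector()
question = torch.randint(0, 30000, (12,))
sentences = torch.randint(0, 30000, (40, 25))
probs = model(question, sentences)
answer_sentence_indices = (probs > 0.5).nonzero(as_tuple=True)[0]
```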
Anthology ID:
2020.findings-emnlp.342
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3840–3849
URL:
https://aclanthology.org/2020.findings-emnlp.342
DOI:
10.18653/v1/2020.findings-emnlp.342
Cite (ACL):
Ming Zhu, Aman Ahuja, Da-Cheng Juan, Wei Wei, and Chandan K. Reddy. 2020. Question Answering with Long Multiple-Span Answers. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3840–3849, Online. Association for Computational Linguistics.
Cite (Informal):
Question Answering with Long Multiple-Span Answers (Zhu et al., Findings 2020)
PDF:
https://preview.aclanthology.org/add_acl24_videos/2020.findings-emnlp.342.pdf
Code
 mingzhu0527/mashqa
Data
CliCR, ELI5, MedQuAD, Natural Questions, SQuAD, WikiQA, emrQA