A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension

Jiahua Liu, Wan Wei, Maosong Sun, Hao Chen, Yantao Du, Dekang Lin


Abstract
The task of machine reading comprehension (MRC) has evolved from answering simple questions from well-edited text to answering real questions from users out of web data. In the real-world setting, full-body text from multiple relevant documents in the top search results are provided as context for questions from user queries, including not only questions with a single, short, and factual answer, but also questions about reasons, procedures, and opinions. In this case, multiple answers could be equally valid for a single question and each answer may occur multiple times in the context, which should be taken into consideration when we build MRC system. We propose a multi-answer multi-task framework, in which different loss functions are used for multiple reference answers. Minimum Risk Training is applied to solve the multi-occurrence problem of a single answer. Combined with a simple heuristic passage extraction strategy for overlong documents, our model increases the ROUGE-L score on the DuReader dataset from 44.18, the previous state-of-the-art, to 51.09.
Anthology ID:
D18-1235
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Note:
Pages:
2109–2118
Language:
URL:
https://aclanthology.org/D18-1235
DOI:
10.18653/v1/D18-1235
Bibkey:
Cite (ACL):
Jiahua Liu, Wan Wei, Maosong Sun, Hao Chen, Yantao Du, and Dekang Lin. 2018. A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2109–2118, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension (Liu et al., EMNLP 2018)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-script-update/D18-1235.pdf
Data
MS MARCO