Abstract
We develop a technique for transfer learning in machine comprehension (MC) using a novel two-stage synthesis network. Given a high-performing MC model in one domain, our technique aims to answer questions about documents in another domain, for which we use no labeled question-answer pairs. Using the proposed synthesis network with a model pretrained on the SQuAD dataset, we achieve an F1 measure of 46.6% on the challenging NewsQA dataset, approaching the performance of in-domain models (F1 measure of 50.0%) and outperforming the out-of-domain baseline by 7.6%, without using the provided annotations.
 - Anthology ID:
 - D17-1087
 - Volume:
 - Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
 - Month:
 - September
 - Year:
 - 2017
 - Address:
 - Copenhagen, Denmark
 - Venue:
 - EMNLP
 - SIG:
 - SIGDAT
 - Publisher:
 - Association for Computational Linguistics
 - Pages:
 - 835–844
 - URL:
 - https://aclanthology.org/D17-1087
 - DOI:
 - 10.18653/v1/D17-1087
 - Cite (ACL):
 - David Golub, Po-Sen Huang, Xiaodong He, and Li Deng. 2017. Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 835–844, Copenhagen, Denmark. Association for Computational Linguistics.
 - Cite (Informal):
 - Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension (Golub et al., EMNLP 2017)
 - PDF:
 - https://preview.aclanthology.org/ingestion-script-update/D17-1087.pdf
 - Code:
 - davidgolub/QuestionGeneration
 - Data:
 - MS MARCO, NewsQA, SQuAD