Abstract
Question answering (QA) models for reading comprehension have achieved human-level accuracy on in-distribution test sets. However, they have been demonstrated to lack robustness to challenge sets, whose distribution is different from that of training sets. Existing data augmentation methods mitigate this problem by simply augmenting training sets with synthetic examples sampled from the same distribution as the challenge sets. However, these methods assume that the distribution of a challenge set is known a priori, making them less applicable to unseen challenge sets. In this study, we focus on question-answer pair generation (QAG) to mitigate this problem. While most existing QAG methods aim to improve the quality of synthetic examples, we conjecture that diversity-promoting QAG can mitigate the sparsity of training sets and lead to better robustness. We present a variational QAG model that generates multiple diverse QA pairs from a paragraph. Our experiments show that our method can improve the accuracy of 12 challenge sets, as well as the in-distribution accuracy.
- Anthology ID:
- 2021.acl-srw.21
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Jad Kabbara, Haitao Lin, Amandalynne Paullada, Jannis Vamvas
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 197–214
- URL:
- https://aclanthology.org/2021.acl-srw.21
- DOI:
- 10.18653/v1/2021.acl-srw.21
- Cite (ACL):
- Kazutoshi Shinoda, Saku Sugawara, and Akiko Aizawa. 2021. Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop, pages 197–214, Online. Association for Computational Linguistics.
- Cite (Informal):
- Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation (Shinoda et al., ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2021.acl-srw.21.pdf
- Code:
- KazutoshiShinoda/VQAG
- Data:
- Natural Questions, NewsQA, SQuAD