BERT-based distractor generation for Swedish reading comprehension questions using a small-scale dataset

Dmytro Kalpakchi, Johan Boye


Abstract
Distractors, the incorrect but preferably plausible answer options, are an important part of constructing multiple-choice questions (MCQs) for reading comprehension assessment. In this paper, we present a new BERT-based method for automatically generating distractors using only a small-scale dataset. We also release a new dataset of Swedish MCQs, which was used to train the model, and propose a methodology for assessing the generated distractors. Evaluation shows that, from a student’s perspective, our method generated one or more plausible distractors for more than 50% of the MCQs in our test set. From a teacher’s perspective, about 50% of the generated distractors were deemed appropriate. We also provide a thorough analysis of the results.
Anthology ID:
2021.inlg-1.43
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
387–403
URL:
https://aclanthology.org/2021.inlg-1.43
DOI:
10.18653/v1/2021.inlg-1.43
Cite (ACL):
Dmytro Kalpakchi and Johan Boye. 2021. BERT-based distractor generation for Swedish reading comprehension questions using a small-scale dataset. In Proceedings of the 14th International Conference on Natural Language Generation, pages 387–403, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
BERT-based distractor generation for Swedish reading comprehension questions using a small-scale dataset (Kalpakchi & Boye, INLG 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2021.inlg-1.43.pdf
Code:
dkalpakchi/swequad-mc