Yao-Chung Fan


2023

pdf
Distractor Generation based on Text2Text Language Models with Pseudo Kullback-Leibler Divergence Regulation
Hui-Juan Wang | Kai-Yu Hsieh | Han-Cheng Yu | Jui-Ching Tsou | Yu An Shih | Chen-Hua Huang | Yao-Chung Fan
Findings of the Association for Computational Linguistics: ACL 2023

In this paper, we address the task of cloze-style multiple choice question (MCQs) distractor generation. Our study is featured by the following designs. First, we propose to formulate the cloze distractor generation as a Text2Text task. Second, we propose pseudo Kullback-Leibler Divergence for regulating the generation to consider the item discrimination index in education evaluation. Third, we explore the candidate augmentation strategy and multi-tasking training with cloze-related tasks to further boost the generation performance. Through experiments with benchmarking datasets, our best perfomring model advances the state-of-the-art result from 10.81 to 22.00 (p@1 score).

2022

pdf
Keyword Provision Question Generation for Facilitating Educational Reading Comprehension Preparation
Ying-Hong Chan | Ho-Lam Chung | Yao-Chung Fan
Proceedings of the 15th International Conference on Natural Language Generation

pdf
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
Shang-Hsuan Chiang | Ssu-Cheng Wang | Yao-Chung Fan
Findings of the Association for Computational Linguistics: EMNLP 2022

Manually designing cloze test consumes enormous time and efforts. The major challenge lies in wrong option (distractor) selection. Having carefully-design distractors improves the effectiveness of learner ability assessment. As a result, the idea of automatically generating cloze distractor is motivated. In this paper, we investigate cloze distractor generation by exploring the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement. Our best performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset is available at https://github.com/AndyChiangSH/CDGP.

2020

pdf
A BERT-based Distractor Generation Scheme with Multi-tasking and Negative Answer Training Strategies.
Ho-Lam Chung | Ying-Hong Chan | Yao-Chung Fan
Findings of the Association for Computational Linguistics: EMNLP 2020

In this paper, we investigate the following two limitations for the existing distractor generation (DG) methods. First, the quality of the existing DG methods are still far from practical use. There are still room for DG quality improvement. Second, the existing DG designs are mainly for single distractor generation. However, for practical MCQ preparation, multiple distractors are desired. Aiming at these goals, in this paper, we present a new distractor generation scheme with multi-tasking and negative answer training strategies for effectively generating multiple distractors. The experimental results show that (1) our model advances the state-of-the-art result from 28.65 to 39.81 (BLEU 1 score) and (2) the generated multiple distractors are diverse and shows strong distracting power for multiple choice question.

2019

pdf
BERT for Question Generation
Ying-Hong Chan | Yao-Chung Fan
Proceedings of the 12th International Conference on Natural Language Generation

In this study, we investigate the employment of the pre-trained BERT language model to tackle question generation tasks. We introduce two neural architectures built on top of BERT for question generation tasks. The first one is a straightforward BERT employment, which reveals the defects of directly using BERT for text generation. And, the second one remedies the first one by restructuring the BERT employment into a sequential manner for taking information from previous decoded results. Our models are trained and evaluated on the question-answering dataset SQuAD. Experiment results show that our best model yields state-of-the-art performance which advances the BLEU4 score of existing best models from 16.85 to 18.91.

pdf
A Recurrent BERT-based Model for Question Generation
Ying-Hong Chan | Yao-Chung Fan
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

In this study, we investigate the employment of the pre-trained BERT language model to tackle question generation tasks. We introduce three neural architectures built on top of BERT for question generation tasks. The first one is a straightforward BERT employment, which reveals the defects of directly using BERT for text generation. Accordingly, we propose another two models by restructuring our BERT employment into a sequential manner for taking information from previous decoded results. Our models are trained and evaluated on the recent question-answering dataset SQuAD. Experiment results show that our best model yields state-of-the-art performance which advances the BLEU 4 score of the existing best models from 16.85 to 22.17.