Taekyu Kang
2025
Bridging Information Gaps with Comprehensive Answers: Improving the Diversity and Informativeness of Follow-Up Questions
Zhe Liu | Taekyu Kang | Haoyu Wang | Seyed Hossein Alavi | Vered Shwartz
Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
Generating diverse follow-up questions that uncover missing information remains a challenge for conversational agents, particularly those running on small, locally hosted models. To address this, we develop an information-gap-driven pipeline that contrasts the initial answer with an LLM-generated comprehensive answer, identifies the information gaps between them, and formulates follow-up questions that bridge those gaps. Applying the pipeline, we augment the existing FollowupQG dataset tenfold. Experiments show that models fine-tuned on the augmented dataset produce significantly more informative and diverse follow-up questions than the same models trained on the original dataset. These findings indicate that our pipeline, which mirrors the human cognitive process of information seeking, provides an efficient distillation channel from state-of-the-art LLMs to smaller models, enabling resource-constrained conversational systems to generate more diverse and informative follow-up questions.
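The abstract describes a three-step pipeline (comprehensive answer → gap identification → gap-bridging questions). The sketch below illustrates one way such a pipeline could be wired together; the `call_llm` helper, prompt wording, and data structures are hypothetical placeholders for illustration, not the authors' actual implementation.

```python
# Minimal sketch of an information-gap-driven follow-up question pipeline,
# under the assumption of a generic text-completion LLM endpoint.
from dataclasses import dataclass, field


@dataclass
class FollowupExample:
    question: str                          # original question
    initial_answer: str                    # answer given in the source dialogue
    followups: list[str] = field(default_factory=list)  # gap-bridging questions


def call_llm(prompt: str) -> str:
    """Placeholder for any LLM completion call (API or local model)."""
    raise NotImplementedError


def generate_followups(question: str, initial_answer: str, n: int = 3) -> FollowupExample:
    # Step 1: obtain a comprehensive answer to the same question from the LLM.
    comprehensive = call_llm(
        f"Answer the following question as comprehensively as possible:\n{question}"
    )

    # Step 2: contrast the initial answer with the comprehensive one to
    # surface information gaps (content only the comprehensive answer covers).
    gaps = call_llm(
        "List the pieces of information present in the comprehensive answer "
        "but missing from the initial answer.\n"
        f"Question: {question}\n"
        f"Initial answer: {initial_answer}\n"
        f"Comprehensive answer: {comprehensive}"
    )

    # Step 3: turn the identified gaps into diverse follow-up questions that
    # would elicit the missing information.
    raw = call_llm(
        f"Write {n} diverse follow-up questions, one per line, that an asker "
        f"could pose to bridge these information gaps:\n{gaps}"
    )
    followups = [line.strip() for line in raw.splitlines() if line.strip()][:n]

    return FollowupExample(question, initial_answer, followups)
```

In this reading, the same procedure applied over an existing question-answer corpus would yield the augmented training data used to fine-tune the smaller models.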