Abstract
Nowadays, large language models (LLMs) have demonstrated their ability to be a powerful knowledge generator of generate-then-read paradigm for open-domain question answering (ODQA). However this new paradigm mainly suffers from the “hallucination” and struggles to handle time-sensitive issue because of its expensive knowledge update costs. On the other hand, retrieve-then-read, as a traditional paradigm, is more limited by the relevance of acquired knowledge to the given question. In order to combine the strengths of both paradigms, and overcome their respective shortcomings, we design a new pipeline called “FlexiQA”, in which we utilize the diverse evaluation capabilities of LLMs to select knowledge effectively and flexibly. First, given a question, we prompt a LLM as a discriminator to identify whether it is time-sensitive. For time-sensitive questions, we follow the retrieve-then-read paradigm to obtain the answer. For the non time-sensitive questions, we further prompt the LLM as an evaluator to select a better document from two perspectives: factuality and relevance. Based on the selected document, we leverage a reader to get the final answer. We conduct extensive experiments on three widely-used ODQA benchmarks, the experimental results fully confirm the effectiveness of our approach.