Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text
Sophie Henning, Talita Anthonio, Wei Zhou, Heike Adel, Mohsen Mesgar, Annemarie Friedrich
Abstract
Generative language models have recently shown remarkable success in generating answers to questions in a given textual context. However, these answers may suffer from hallucination, wrongly cite evidence, and spread misleading information. In this work, we address this problem by employing ChatGPT, a state-of-the-art generative model, as a machine-reading system. We ask it to retrieve answers to lexically varied and open-ended questions from trustworthy instructive texts. We introduce WHERE (WikiHow Evidence REtrieval), a new high-quality evaluation benchmark of a set of WikiHow articles exhaustively annotated with evidence sentences to questions that comes with a special challenge: All questions are about the article’s topic, but not all can be answered using the provided context. We interestingly find that when using a regular question-answering prompt, ChatGPT neglects to detect the unanswerable cases. When provided with a few examples, it learns to better judge whether a text provides answer evidence or not. Alongside this important finding, our dataset defines a new benchmark for evidence retrieval in question answering, which we argue is one of the necessary next steps for making large language models more trustworthy.
- Anthology ID: 2023.findings-emnlp.949
- Original: 2023.findings-emnlp.949v1
- Version 2: 2023.findings-emnlp.949v2
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Houda Bouamor, Juan Pino, Kalika Bali
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 14229–14241
- URL: https://aclanthology.org/2023.findings-emnlp.949
- Cite (ACL): Sophie Henning, Talita Anthonio, Wei Zhou, Heike Adel, Mohsen Mesgar, and Annemarie Friedrich. 2023. Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 14229–14241, Singapore. Association for Computational Linguistics.
- Cite (Informal): Is the Answer in the Text? Challenging ChatGPT with Evidence Retrieval from Instructive Text (Henning et al., Findings 2023)
- PDF: https://preview.aclanthology.org/emnlp-22-attachments/2023.findings-emnlp.949.pdf