VeReaFine: Iterative Verification Reasoning Refinement RAG for Hallucination-Resistant on Open-Ended Clinical QA

Pakawat Phasook, Rapepong Pitijaroonpong, Jiramet Kinchagawat, Amrest Chinkamol, Tossaporn Saengja, Kiartnarin Udomlapsakul, Jitkapat Sawatphol, Piyalitt Ittichaiwong


Abstract
We present VeReaFine, a novel “Verifier-RAG” pipeline designed to eliminate hallucinations in open-ended clinical question answering. VeReaFine interleaves three tightly coupled stages—retrieval, verification, and generation—across up to three iterations. First, a two-stage dense retriever (BM-Retriever-410M → BM-Reranker-2B) fetches and ranks top-k biomedical passages; an 8B-parameter MedReason verifier then filters these for direct relevance and identifies missing evidence. When the verifier deems the context insufficient, it formulates a focused “feedback query” to retrieve additional passages (bounded to prevent infinite loops). Once a minimal ground-truth context is assembled, a 7B-parameter generator (Qwen2.5-7B-Instruct) drafts an answer purely from that vetted context, and the verifier performs a final check—prompting the generator to refine any remaining unsupported claims. By iteratively fetching only missing facts and ensuring every assertion is evidence-backed, VeReaFine achieves monotonic factuality improvements with minimal overhead. On the BioNLP 2025 ClinIQLink “LLM Lie-Detector” shared task, our 7B generator augmented with VeReaFine matches or surpasses a 32B medical model on open-ended reasoning metrics, reducing multi-hop inverse step-identification errors by 26%. These findings demonstrate that moderate-size LLMs, when guided by targeted verification loops, can deliver expert-level reliability in clinical QA.
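The control flow described in the abstract (retrieve, verify, re-query on missing evidence for a bounded number of rounds, then generate and refine) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Verdict` structure, the `verified_rag` function, its callable parameters, and the toy corpus and stub functions are all hypothetical names introduced here. In the real pipeline these callables would presumably wrap the retriever/reranker, the MedReason verifier, and the Qwen2.5-7B-Instruct generator named above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical verifier output; the field names are assumptions."""
    kept: List[str]            # passages judged directly relevant
    sufficient: bool           # True when the kept context can support an answer
    feedback_query: str = ""   # focused follow-up query for missing evidence


def verified_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    verify: Callable[[str, List[str]], Verdict],
    generate: Callable[[str, List[str]], str],
    refine: Callable[[str, List[str], str], str],
    max_iters: int = 3,  # the abstract states "up to three iterations"
) -> Tuple[str, List[str]]:
    """Bounded retrieve/verify loop followed by generation and a final check."""
    context: List[str] = []
    query = question
    for _ in range(max_iters):
        fetched = retrieve(query)
        # Merge new passages with the already vetted ones, without duplicates.
        candidates = context + [p for p in fetched if p not in context]
        verdict = verify(question, candidates)
        context = verdict.kept
        if verdict.sufficient or not verdict.feedback_query:
            break
        query = verdict.feedback_query  # fetch only the missing evidence next
    draft = generate(question, context)
    # Final check: the refine step may rewrite unsupported claims in the draft.
    return refine(question, context, draft), context


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = [
    "Metformin is a first-line drug for type 2 diabetes.",
    "Metformin is contraindicated in severe renal impairment.",
    "Aspirin inhibits platelet aggregation.",
]
calls: List[str] = []  # records every retrieval query issued


def toy_retrieve(query: str) -> List[str]:
    calls.append(query)
    return [p for p in CORPUS if query.lower() in p.lower()]


def toy_verify(question: str, passages: List[str]) -> Verdict:
    kept = [p for p in passages if "metformin" in p.lower()]
    enough = any("renal" in p.lower() for p in kept)
    return Verdict(kept, enough, "" if enough else "renal")


def toy_generate(question: str, context: List[str]) -> str:
    return " ".join(context)  # answer built purely from the vetted context


def toy_refine(question: str, context: List[str], draft: str) -> str:
    return draft  # a real verifier would flag unsupported claims here


answer, context = verified_rag(
    "diabetes", toy_retrieve, toy_verify, toy_generate, toy_refine
)
```

In this toy run the first retrieval lacks the renal-safety passage, so the stub verifier issues one feedback query before generation. The `max_iters` bound plays the role of the loop limit the abstract mentions for preventing infinite retrieval.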
Anthology ID:
2025.bionlp-share.34
Volume:
BioNLP 2025 Shared Tasks
Month:
August
Year:
2025
Address:
Vienna, Austria
Editors:
Sarvesh Soni, Dina Demner-Fushman
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
281–288
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-share.34/
Cite (ACL):
Pakawat Phasook, Rapepong Pitijaroonpong, Jiramet Kinchagawat, Amrest Chinkamol, Tossaporn Saengja, Kiartnarin Udomlapsakul, Jitkapat Sawatphol, and Piyalitt Ittichaiwong. 2025. VeReaFine: Iterative Verification Reasoning Refinement RAG for Hallucination-Resistant on Open-Ended Clinical QA. In BioNLP 2025 Shared Tasks, pages 281–288, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
VeReaFine: Iterative Verification Reasoning Refinement RAG for Hallucination-Resistant on Open-Ended Clinical QA (Phasook et al., BioNLP 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bionlp-share.34.pdf