TripleCheck: Transparent Post-Hoc Verification of Biomedical Claims in AI-Generated Answers

Ana Valeria González, Sidsel Boldsen, Roland Hangelbroek


Abstract
Retrieval-Augmented Generation (RAG) has advanced Question Answering (QA) by connecting Large Language Models (LLMs) to external knowledge. However, these systems can still produce answers that are unsupported, lack clear traceability, or misattribute information, a critical issue in the biomedical domain, where accuracy, trust, and control are essential. We introduce TripleCheck, a post-hoc framework that breaks down an LLM’s answer into factual triples and checks each against both the retrieved context and a biomedical knowledge graph. By highlighting which statements are supported, traceable, or correctly attributed, TripleCheck enables users to spot gaps, unsupported claims, and misattributions, prompting more careful follow-up. We present the TripleCheck framework, evaluate it on the SciFact benchmark, analyze its limitations, and share preliminary expert feedback. Results show that TripleCheck provides nuanced insight, potentially supporting greater trust and safer AI adoption in biomedical applications.
Anthology ID:
2025.hcinlp-1.4
Volume:
Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Su Lin Blodgett, Amanda Cercas Curry, Sunipa Dev, Siyan Li, Michael Madaio, Jack Wang, Sherry Tongshuang Wu, Ziang Xiao, Diyi Yang
Venues:
HCINLP | WS
Publisher:
Association for Computational Linguistics
Pages:
33–47
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.hcinlp-1.4/
Cite (ACL):
Ana Valeria González, Sidsel Boldsen, and Roland Hangelbroek. 2025. TripleCheck: Transparent Post-Hoc Verification of Biomedical Claims in AI-Generated Answers. In Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP), pages 33–47, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
TripleCheck: Transparent Post-Hoc Verification of Biomedical Claims in AI-Generated Answers (González et al., HCINLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.hcinlp-1.4.pdf