Mayank Waghmare


2026

Retrieval strategy selection is a critical but understudied design decision in biomedical RAG systems. Existing evaluations either rely on lexical metrics that miss answer grounding or require proprietary infrastructure that limits reproducibility. We present BioRAG, a head-to-head ablation of seven retrieval strategies on BioASQ-13b, evaluated using four RAGAs metrics with a locally deployed judge at zero monetary cost. Hybrid BM25 plus dense retrieval with Reciprocal Rank Fusion achieves a faithfulness of 0.534 and a context recall of 0.507, improvements of 50% and 85% over naive dense retrieval that hold across three random-seed resamples. HyDE improves faithfulness by 14% but reduces context precision by 52%, a tradeoff not previously documented on BioASQ. No single strategy dominates all four metrics, indicating that strategy selection must be application-driven. A sensitivity analysis over retrieval depths k ∈ {3, 5, 10} confirms that the strategy ranking is stable. A domain-mismatch diagnostic finds a 2% corpus coverage failure rate. The full pipeline runs on consumer hardware without paid APIs, directly addressing BioNLP 2026’s emphasis on reproducibility and evaluation frameworks for health-related applications.
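For readers unfamiliar with Reciprocal Rank Fusion, the following is a minimal illustrative sketch of how a BM25 ranking and a dense ranking can be fused. It is not the BioRAG implementation: the function name, the document IDs, and the smoothing constant `rrf_k = 60` (the value commonly used in the RRF literature) are assumptions made for illustration, and `rrf_k` is unrelated to the retrieval depth k used in the sensitivity analysis.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, rrf_k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (rrf_k + rank)) over the lists in which
    it appears, with ranks starting at 1. rrf_k = 60 is an assumed default,
    not a value taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rrf_k + rank)
    # Sort by descending fused score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a sparse (BM25) and a dense retriever.
bm25_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Because RRF uses only rank positions, it needs no calibration between BM25 scores and embedding similarities, which is one reason it is a common choice for hybrid retrieval.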