Varja-Dominators at MedGenVidQA 2026: Hybrid Video and Document Retrieval using PubMedBERT, T5 Query Expansion, and Cross-Encoder Re-Ranking

Pratik Dhaktode, Suhani Bighane, Anupama Phakatkar


Abstract
This paper presents a system for Task A of the MedGenVidQA 2026 shared task, which requires simultaneously retrieving relevant PubMed documents and medical videos for 60 consumer health topics. The core contribution is a unified multi-stage pipeline that treats video and document retrieval as complementary rather than independent problems. For video retrieval, the system fine-tunes a PubMedBERT bi-encoder on 2,710 MedVidQA training samples using BM25-driven hard negative mining. Video transcripts (833 unique videos) are segmented into overlapping 30-second temporal chunks with a 10-second stride, producing 32,489 indexed chunks. At query time, T5-based query expansion generates enriched queries for BM25 sparse retrieval, while the original query drives FAISS dense retrieval. The two ranked lists are fused via weighted Reciprocal Rank Fusion (RRF, dense weight 0.75, sparse weight 0.25), and a cross-encoder (MiniLM-L-6-v2) re-ranks the top-200 fused candidates to produce the final top-10 videos. For document retrieval, the NCBI PubMed ESearch API is queried using a progressive keyword fallback chain with exponential backoff, ensuring full topic coverage. The system achieves a MAP of 0.3898, Recall@10 of 0.8449, and NDCG@10 of 0.1079, with complete 60/60 topic coverage across both retrieval modalities. Key limitations include reliance solely on transcript text for video retrieval (no visual or audio features) and dependence on a live API for document retrieval.
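The weighted fusion step described in the abstract can be illustrated with a minimal sketch. The dense and sparse weights (0.75 and 0.25) come from the abstract; the smoothing constant k = 60, the function name weighted_rrf, and the toy video IDs are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, k=60):
    """Fuse several ranked lists with weighted Reciprocal Rank Fusion.

    ranked_lists: dict mapping a retriever name to a list of item IDs,
                  ordered best first.
    weights:      dict mapping the same retriever names to a float weight.
    k:            smoothing constant (assumed value; not given in the abstract).
    """
    scores = defaultdict(float)
    for name, ranking in ranked_lists.items():
        weight = weights[name]
        # Ranks are 1-based: the top item contributes weight / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += weight / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical toy rankings standing in for FAISS (dense) and BM25 (sparse).
dense_ranking = ["v3", "v1", "v2"]
sparse_ranking = ["v2", "v3", "v4"]

fused = weighted_rrf(
    {"dense": dense_ranking, "sparse": sparse_ranking},
    {"dense": 0.75, "sparse": 0.25},
)
print(fused)
```

In a pipeline like the one described, the head of such a fused list (the top 200 candidates, per the abstract) would then be passed to the cross-encoder for re-ranking.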
Anthology ID:
2026.bionlp-2.32
Volume:
Proceedings of the BioNLP 2026 (Shared Tasks)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Deepak Gupta, Dina Demner-Fushman
Venues:
BioNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
243–247
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-2.32/
Cite (ACL):
Pratik Dhaktode, Suhani Bighane, and Anupama Phakatkar. 2026. Varja-Dominators at MedGenVidQA 2026: Hybrid Video and Document Retrieval using PubMedBERT, T5 Query Expansion, and Cross-Encoder Re-Ranking. In Proceedings of the BioNLP 2026 (Shared Tasks), pages 243–247, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Varja-Dominators at MedGenVidQA 2026: Hybrid Video and Document Retrieval using PubMedBERT, T5 Query Expansion, and Cross-Encoder Re-Ranking (Dhaktode et al., BioNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-2.32.pdf
Supplementary material:
2026.bionlp-2.32.SupplementaryMaterial.zip
Supplementary material:
2026.bionlp-2.32.SupplementaryMaterial.txt