MMLF: Multi-query Multi-passage Late Fusion Retrieval

Yuan-Ching Kuo, Yi Yu, Chih-Ming Chen, Chuan-Ju Wang


Abstract
Leveraging large language models (LLMs) for query expansion has proven highly effective across diverse tasks and languages. Yet, challenges remain in optimizing query formatting and prompting, often with less focus on handling retrieval results. In this paper, we introduce Multi-query Multi-passage Late Fusion (MMLF), a straightforward yet potent pipeline that generates sub-queries, expands each into a pseudo-document, retrieves against each pseudo-document individually, and aggregates the results using reciprocal rank fusion. Our experiments demonstrate that MMLF achieves superior performance across five BEIR benchmark datasets, with an average improvement of 4% and a maximum gain of up to 8% in both Recall@1k and nDCG@10 compared to state-of-the-art baselines.
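The late-fusion step named in the abstract, reciprocal rank fusion (RRF), can be sketched as follows. This is a minimal illustration of standard RRF, not the paper's exact implementation; the constant k=60 is the conventional default from the original RRF formulation and is assumed here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document's fused score is the sum of 1 / (k + rank) over
    every list in which it appears (ranks start at 1), so documents
    ranked highly by many per-query retrievals rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort document IDs by descending fused score.
    return sorted(scores, key=scores.get, reverse=True)

# Example: three per-sub-query retrieval runs over a shared corpus
# (hypothetical doc IDs, for illustration only).
runs = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]
fused = reciprocal_rank_fusion(runs)  # d2 is fused to the top
```

Because RRF uses only ranks, not raw retrieval scores, it fuses runs from heterogeneous retrievers without any score normalization, which is what makes it a natural fit for aggregating the per-sub-query result lists.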
Anthology ID:
2025.findings-naacl.367
Volume:
Findings of the Association for Computational Linguistics: NAACL 2025
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6587–6598
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.367/
Cite (ACL):
Yuan-Ching Kuo, Yi Yu, Chih-Ming Chen, and Chuan-Ju Wang. 2025. MMLF: Multi-query Multi-passage Late Fusion Retrieval. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 6587–6598, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
MMLF: Multi-query Multi-passage Late Fusion Retrieval (Kuo et al., Findings 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.367.pdf