MMLF: Multi-query Multi-passage Late Fusion Retrieval
Yuan-Ching Kuo | Yi Yu | Chih-Ming Chen | Chuan-Ju Wang
Findings of the Association for Computational Linguistics: NAACL 2025
Leveraging large language models (LLMs) for query expansion has proven highly effective across diverse tasks and languages. Yet challenges remain in optimizing query formatting and prompting, and comparatively little attention has been paid to handling the retrieval results themselves. In this paper, we introduce Multi-query Multi-passage Late Fusion (MMLF), a straightforward yet potent pipeline that generates sub-queries, expands them into pseudo-documents, retrieves them individually, and aggregates the results using reciprocal rank fusion. Our experiments demonstrate that MMLF delivers superior performance across five BEIR benchmark datasets, achieving an average improvement of 4% and a maximum gain of up to 8% in both Recall@1k and nDCG@10 compared to state-of-the-art methods.
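The final aggregation step of the pipeline, reciprocal rank fusion (RRF), can be sketched as follows. This is a minimal illustration of standard RRF, not the paper's implementation; the function name, document-id representation, and the conventional smoothing constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document receives a score of 1 / (k + rank) from every
    ranking it appears in (ranks start at 1); documents retrieved
    consistently near the top across sub-queries rise in the fused list.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, highest first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical example: three retrieval runs, one per LLM-generated sub-query.
fused = reciprocal_rank_fusion([
    ["d1", "d2", "d3"],
    ["d2", "d1"],
    ["d2", "d3", "d1"],
])
# "d2" ranks first: it appears at or near the top in all three runs.
```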