Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Zhiyuan Peng, Ting-Ruen Wei, Tingyu Song, Yilun Zhao


Abstract
Large Language Models (LLMs) have recently been applied to reranking tasks in information retrieval, achieving strong performance. However, their high computational demands often hinder practical deployment. Existing studies evaluate the efficiency of LLM-based rerankers using proxy metrics such as latency, the number of forward passes, and the number of input and output tokens. These metrics, however, depend on hardware and runtime choices (e.g., degree of parallelism, batch size) and often fail to account for model size, making them difficult to interpret and obscuring the evaluation of the efficiency-effectiveness trade-off. To address this issue, we propose two metrics for LLM-based rerankers: RPP (ranking metrics per PetaFLOP), which measures how much ranking quality (e.g., NDCG or MRR) a method achieves per PetaFLOP, and QPP (queries per PetaFLOP), which measures how many queries can be processed per PetaFLOP. Alongside the new metrics, we develop an interpretable FLOPs estimator that estimates the FLOPs of an LLM-based reranker without running any experiments. Based on the proposed metrics, we conduct comprehensive experiments evaluating a wide range of LLM-based rerankers with different architectures, studying the efficiency-effectiveness trade-off and bringing this issue to the attention of the research community.
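
To make the proposed metrics concrete, the following is a minimal sketch, not the authors' estimator: it assumes the common back-of-the-envelope approximation that one forward pass of a decoder-only LLM costs roughly 2 x parameters x tokens FLOPs, ignoring attention and KV-cache terms. The function names (estimate_flops, rpp, qpp) and the example numbers are illustrative assumptions, not taken from the paper.

    # Hypothetical sketch of FLOPs-normalized reranking metrics.
    # Assumes forward-pass FLOPs ~= 2 * n_params * n_tokens; the paper's
    # interpretable estimator may account for more architectural detail.

    def estimate_flops(n_params: float, n_input_tokens: int,
                       n_output_tokens: int = 0) -> float:
        """Rough FLOPs for one reranking call of a decoder-only LLM.

        Each generated token is also charged ~2 * n_params FLOPs;
        attention and KV-cache costs are ignored in this rough form.
        """
        return 2.0 * n_params * (n_input_tokens + n_output_tokens)

    def rpp(ranking_metric: float, total_flops: float) -> float:
        """Ranking quality (e.g., NDCG or MRR) achieved per PetaFLOP."""
        return ranking_metric / (total_flops / 1e15)

    def qpp(n_queries: int, total_flops: float) -> float:
        """Queries processed per PetaFLOP."""
        return n_queries / (total_flops / 1e15)

    # Illustrative example: a 7B-parameter pointwise reranker scoring
    # 100 passages per query (~512 input tokens each, 1 output token)
    # over 1,000 queries, with an assumed NDCG@10 of 0.45.
    per_pass = estimate_flops(7e9, 512, 1)
    total = per_pass * 100 * 1_000
    print(f"RPP: {rpp(0.45, total):.5f} NDCG@10 per PetaFLOP")
    print(f"QPP: {qpp(1_000, total):.2f} queries per PetaFLOP")

Under these assumptions, a smaller model or a method needing fewer forward passes per query raises both RPP and QPP, which is exactly the trade-off the metrics are meant to expose.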
Anthology ID:
2025.emnlp-industry.186
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2782–2791
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.186/
Cite (ACL):
Zhiyuan Peng, Ting-Ruen Wei, Tingyu Song, and Yilun Zhao. 2025. Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 2782–2791, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers (Peng et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.186.pdf