Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

Raphael Tang, Crystina Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture


Abstract
Large language models (LLMs) exhibit positional bias in how they use context, which especially affects listwise ranking. To address this, we propose permutation self-consistency, a form of self-consistency over the ranking list outputs of black-box LLMs. Our key idea is to marginalize out different list orders in the prompt to produce an order-independent ranking with less positional bias. First, given some input prompt, we repeatedly shuffle the list in the prompt and pass it through the LLM while holding the instructions the same. Next, we aggregate the resulting sample of rankings by computing the central ranking closest in distance to all of them, marginalizing out prompt order biases in the process. Theoretically, we prove the robustness of our method, showing convergence to the true ranking under random perturbations. Empirically, on five datasets in sorting and passage reranking, our approach improves scores from conventional inference by up to 34-52% for Mistral, 7-18% for GPT-3.5, and 8-16% for LLaMA v2 (70B). Our code is at https://github.com/castorini/perm-sc.
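The shuffle-then-aggregate procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes Kendall tau distance as the distance between rankings and finds the central ranking by brute force, which is only practical for short lists. The names `rank_fn`, `num_samples`, and `seed` are hypothetical: `rank_fn` stands in for a black-box LLM call that takes the items in prompt order and returns them ranked.

```python
import itertools
import random
from typing import Callable, List, Sequence


def kendall_tau_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the item pairs that rankings a and b order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in itertools.combinations(a, 2)  # x precedes y in a
        if pos_b[x] > pos_b[y]                    # but follows y in b
    )


def central_ranking(rankings: List[Sequence[str]]) -> List[str]:
    """Return the ranking with the smallest total distance to all samples.

    Assumption: exhaustive search over all permutations, so this is
    feasible only for a handful of items.
    """
    items = sorted(rankings[0])
    best = min(
        itertools.permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )
    return list(best)


def permutation_self_consistency(
    items: Sequence[str],
    rank_fn: Callable[[List[str]], List[str]],  # hypothetical LLM wrapper
    num_samples: int = 20,
    seed: int = 0,
) -> List[str]:
    """Shuffle the input list, rank each shuffle, then aggregate."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        shuffled = list(items)
        rng.shuffle(shuffled)              # vary only the list order
        samples.append(rank_fn(shuffled))  # instructions stay fixed
    return central_ranking(samples)
```

In practice `rank_fn` would build the prompt from the shuffled list and parse the model's output back into a list. Passing Python's built-in `sorted` as `rank_fn` gives an order-insensitive ranker that is handy for checking the plumbing.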
Anthology ID:
2024.naacl-long.129
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2327–2340
URL:
https://preview.aclanthology.org/icon-24-ingestion/2024.naacl-long.129/
DOI:
10.18653/v1/2024.naacl-long.129
Cite (ACL):
Raphael Tang, Crystina Zhang, Xueguang Ma, Jimmy Lin, and Ferhan Ture. 2024. Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 2327–2340, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models (Tang et al., NAACL 2024)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2024.naacl-long.129.pdf