Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval

Matei Benescu; Ivo Pascal De Jong

Abstract

The emergence of Large Language Models (LLMs) has made new Information Retrieval methods available, in which relevance is estimated directly through language understanding and reasoning rather than through embedding similarity. We argue that similarity is a short-sighted interpretation of relevance, and that LLM-Based Relevance Judgment Systems (LLM-RJS) with reasoning have the potential to outperform Neural Embedding Retrieval Systems (NERS) by overcoming this limitation. Using the TREC-DL 2019 passage retrieval dataset, we compare various LLM-RJS with NERS but observe no noticeable improvement. Subsequently, we analyze the impact of reasoning by comparing LLM-RJS with and without reasoning. We find that human annotations also suffer from short-sightedness, and that the false positives of the reasoning LLM-RJS are primarily annotation mistakes caused by this short-sightedness. We conclude that LLM-RJS do have the ability to address the short-sightedness limitation of NERS, but that this ability cannot be evaluated with standard annotated relevance datasets.

Anthology ID: 2026.acl-srw.5
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 47–59
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.5/
Cite (ACL): Matei Benescu and Ivo Pascal de Jong. 2026. Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 47–59, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval (Benescu & de Jong, ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.5.pdf
