PROBES : Performance and Relevance Observation for BEtter Search
Sejal Jain, Cyrus Andre DSouza, Jitenkumar Babubhai Rana, Aniket Joshi, Promod Yenigalla
Abstract
High-quality search is essential for the success of online platforms, spanning e-commerce, social media, shopping-focused applications, and broader search systems such as content discovery and enterprise web search. To ensure an optimal user experience and drive business growth, continuous evaluation and improvement of search systems are crucial. This paper introduces PROBES, a novel multi-task system powered by Large Language Models (LLMs) and designed for end-to-end evaluation of semantic search systems. PROBES identifies context-aware relevance on a fine-grained scale (exact, substitute, complement, irrelevant) by leveraging the query category, feature-level intent, and category-aware feature importance, enabling more precise and consistent judgments than relying solely on raw query text. This allows PROBES to provide differentiated relevance assessment across a diverse range of query categories. PROBES then investigates the reasons behind irrelevant results (Precision issues) by checking for product content conflicts and inaccuracies. It also analyzes Missed Recall by leveraging retrieval and relevance models to determine whether a missed recall was due to a selection issue or a ranking/retrieval system issue. To evaluate PROBES, we introduce a new metric, the Actionable Error Rate (AER), defined as the proportion of actionable errors over all flagged errors. We observe that PROBES operates at an AER of 76%, generating actionable insights across 100 product categories.
- Anthology ID:
- 2026.eacl-industry.48
- Volume:
- Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Yevgen Matusevych, Gülşen Eryiğit, Nikolaos Aletras
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 625–635
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.48/
- Cite (ACL):
- Sejal Jain, Cyrus Andre DSouza, Jitenkumar Babubhai Rana, Aniket Joshi, and Promod Yenigalla. 2026. PROBES : Performance and Relevance Observation for BEtter Search. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 625–635, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- PROBES : Performance and Relevance Observation for BEtter Search (Jain et al., EACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.48.pdf
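
The abstract defines the Actionable Error Rate (AER) as the proportion of actionable errors over all flagged errors. A minimal sketch of that ratio follows. The record layout, the field names, the `actionable_error_rate` helper, and the four sample flags are all hypothetical and invented for illustration; the abstract does not say how a flag is judged actionable, so the sketch simply assumes a boolean label from some audit step.

```python
from dataclasses import dataclass


@dataclass
class FlaggedError:
    """One error flagged by the evaluation system (hypothetical record layout)."""
    query: str
    error_type: str   # illustrative labels such as "precision" or "missed_recall"
    actionable: bool  # assumed to come from an audit of the flag


def actionable_error_rate(flags):
    """Return actionable flags divided by all flags, or None if nothing was flagged."""
    if not flags:
        return None
    return sum(1 for f in flags if f.actionable) / len(flags)


# Made-up sample data, not taken from the paper.
flags = [
    FlaggedError("red running shoes", "precision", True),
    FlaggedError("usb-c cable 2m", "missed_recall", True),
    FlaggedError("kids lunch box", "precision", False),
    FlaggedError("desk lamp", "missed_recall", True),
]
print(actionable_error_rate(flags))  # 3 of 4 flags are actionable -> 0.75
```

Under this reading, the reported AER of 76% would mean that roughly three out of four flagged errors led to something a team could act on. Returning `None` for an empty input is a design choice of this sketch, made to avoid reporting a misleading 0% when no errors were flagged.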