NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval

Lev Sukherman, Artem Frenk, Nina Klimenkova, Connor Jason


Abstract
Neural IR models achieve strong performance but remain difficult to interpret. We present NEAT-IR, a black-box analysis framework that explains ColBERT’s ranking behavior using 26 classical IR features (BM25, TF-IDF, IDF measures, positional signals). We analyze ColBERT through two complementary lenses: regression (predicting exact scores) and learning-to-rank (predicting relative order), both evaluated on MS MARCO (48,250 query-passage pairs). Our key finding is a score-rank gap: classical features preserve ColBERT’s rankings nearly perfectly (NDCG@5 ≈ 0.99) yet explain only about 28% of the score variance (R² ≈ 0.28). Feature attribution reveals that the regression and ranking models rely on distinct feature subsets: query-level IDF signals dominate score prediction, while document-matching features (BM25, cosine TF-IDF) drive ranking preservation. These findings suggest that ColBERT’s ordinal behavior on MS MARCO is largely recoverable from classical signals, while neural contributions primarily affect score magnitude. NEAT-IR enables practitioners to diagnose when neural rankers deviate from classical patterns, supporting interpretable model auditing and informed hybrid pipeline design.
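The score-rank gap can be illustrated with a minimal sketch on toy numbers. This is not the NEAT-IR implementation: the function names, the made-up scores, and the choice to use the target model's scores directly as NDCG gains are all assumptions made for illustration. It shows how a surrogate can reproduce an ordering exactly while fitting the score values poorly.

```python
import math

def r_squared(target, predicted):
    """Coefficient of determination of `predicted` against `target`."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, predicted))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot

def ndcg_at_k(target, predicted, k=5):
    """NDCG@k of the ordering induced by `predicted`.

    Assumption: the target scores themselves serve as (non-negative) gains.
    """
    def dcg(indices):
        return sum(target[i] / math.log2(rank + 2)
                   for rank, i in enumerate(indices))

    by_pred = sorted(range(len(target)), key=lambda i: -predicted[i])[:k]
    by_target = sorted(range(len(target)), key=lambda i: -target[i])[:k]
    return dcg(by_pred) / dcg(by_target)

# Hypothetical scores for six passages of one query (not from the paper).
target = [30.0, 25.0, 20.0, 15.0, 10.0, 5.0]   # stand-in for neural scores
surrogate = [3.0, 2.5, 2.0, 1.5, 1.0, 0.5]     # same order, different scale

gap_ndcg = ndcg_at_k(target, surrogate, k=5)
gap_r2 = r_squared(target, surrogate)
print(f"NDCG@5 = {gap_ndcg:.3f}, R^2 = {gap_r2:.3f}")
```

Because NDCG depends only on the induced ordering while R² penalizes every deviation in magnitude, the two metrics can diverge sharply on the same predictions, which is the pattern the abstract reports.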
Anthology ID:
2026.acl-srw.21
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
240–246
URL:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.21/
Cite (ACL):
Lev Sukherman, Artem Frenk, Nina Klimenkova, and Connor Jason. 2026. NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 240–246, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval (Sukherman et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.21.pdf