Jingzhou Jiang

2026

FLARE: Task-Agnostic Embedding Model Evaluation via Normalizing Flows
Jingzhou Jiang | Yixuan Tang | Yi Yang | Kar Yan Tam
Findings of the Association for Computational Linguistics: ACL 2026

Despite the widespread adoption of text embedding models, selecting the optimal model for a specific target corpus remains challenging due to the lack of task-specific labels. While task-agnostic evaluation offers a promising solution by relying on unlabeled data, existing approaches based on kernel estimators or Gaussian mixtures fail to model high-dimensional distributions effectively, resulting in unstable rankings. To address this limitation, we propose FLARE (Flow-based Label-free Assessment of Representation Embeddings), which employs normalizing flows to estimate information sufficiency in high-dimensional spaces. By learning invertible transformations, flows enable exact density estimation while mitigating the instability inherent in distance-based methods. We provide theoretical guarantees showing that our estimation error depends on the data’s intrinsic structure rather than its raw dimensionality. Experiments across 11 datasets demonstrate that FLARE achieves a strong Spearman’s ρ (up to 0.90) with supervised benchmarks, remaining robust even for high-dimensional embeddings (d ≥ 3,584).
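The abstract's claim that invertible transformations give exact density estimation rests on the change-of-variables identity: for an invertible map f that sends data x to a latent z = f(x) with a known base density, log p_X(x) = log p_Z(f(x)) + log |det ∂f/∂x|. The sketch below illustrates that identity only. It is not FLARE's implementation: it assumes a fixed elementwise affine map rather than a learned flow, a standard normal base density, and the hypothetical helper names `standard_normal_logpdf` and `affine_flow_logpdf`.

```python
import math


def standard_normal_logpdf(z):
    """Log density of a standard normal, summed over the coordinates of z."""
    return sum(-0.5 * (math.log(2 * math.pi) + zi * zi) for zi in z)


def affine_flow_logpdf(x, scale, shift):
    """Exact log density of x under a toy elementwise affine flow.

    Illustrative assumption: the invertible map is z = (x - shift) / scale,
    with every scale entry nonzero. A real normalizing flow would learn a
    far more expressive map, but the density formula has the same form.
    """
    # Forward pass: map the data point into the latent space.
    z = [(xi - b) / a for xi, a, b in zip(x, scale, shift)]
    # The Jacobian of the map is diagonal with entries 1 / scale, so its
    # log absolute determinant is minus the sum of log |scale|.
    log_det = -sum(math.log(abs(a)) for a in scale)
    # Change of variables: base log density plus the log-determinant term.
    return standard_normal_logpdf(z) + log_det


# Evaluate the density at a 2-dimensional point (values chosen arbitrarily).
log_px = affine_flow_logpdf([1.0, -1.0], scale=[2.0, 0.5], shift=[1.0, -1.0])
print(log_px)
```

Because the log-determinant term is computed in closed form, the density involves no neighbor distances or kernel bandwidths, which is the contrast the abstract draws with distance-based estimators.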

