Andrei Ungureanu


2025

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval
Michael Günther | Saba Sturua | Mohammad Kalim Akram | Isabelle Mohr | Andrei Ungureanu | Bo Wang | Sedigheh Eslami | Scott Martens | Maximilian Werk | Nan Wang | Han Xiao
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)

We introduce jina-embeddings-v4, a 3.8-billion-parameter embedding model that unifies text and image representations, with a novel architecture supporting both single-vector and multi-vector embeddings. It achieves high performance on both single-modal and cross-modal retrieval tasks, and is particularly strong in processing visually rich content such as tables, charts, diagrams, and mixed-media formats that incorporate both image and textual information. We also introduce JVDR, a novel benchmark for visually rich document retrieval that includes more diverse materials and query types than previous efforts. We use JVDR to show that jina-embeddings-v4 greatly improves on state-of-the-art performance for these kinds of tasks.
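The abstract contrasts single-vector and multi-vector embeddings without detailing how each is scored at retrieval time. The sketch below illustrates the usual distinction: a pooled single vector per query/document compared by cosine similarity, versus token-level multi-vector embeddings scored with ColBERT-style late interaction (MaxSim). This is a minimal, hedged example; the random tensors stand in for actual model outputs, and the scoring functions reflect common practice rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def single_vector_score(query_vec: torch.Tensor, doc_vec: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between one pooled query vector and one pooled document vector."""
    q = F.normalize(query_vec, dim=-1)
    d = F.normalize(doc_vec, dim=-1)
    return (q * d).sum(-1)


def multi_vector_score(query_vecs: torch.Tensor, doc_vecs: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) score: for each query token vector, take the
    best-matching document token vector, then sum over query tokens.
    Shapes: query_vecs (n_q, dim), doc_vecs (n_d, dim)."""
    q = F.normalize(query_vecs, dim=-1)
    d = F.normalize(doc_vecs, dim=-1)
    sim = q @ d.T                       # (n_q, n_d) token-to-token similarities
    return sim.max(dim=1).values.sum()  # max over document tokens, summed over query tokens


# Toy example: random vectors as placeholders for embeddings produced by the model.
torch.manual_seed(0)
dim = 128
q_pooled, d_pooled = torch.randn(dim), torch.randn(dim)
q_tokens, d_tokens = torch.randn(8, dim), torch.randn(200, dim)
print("single-vector score:", float(single_vector_score(q_pooled, d_pooled)))
print("multi-vector score:", float(multi_vector_score(q_tokens, d_tokens)))
```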