Cristina Arhiliuc
2026
MTEB-NL and E5-NL: Embedding Benchmark and Models for Dutch
Nikolay Banar | Ehsan Lotfi | Jens Van Nooten | Cristina Arhiliuc | Marija Kliocaite | Walter Daelemans
Findings of the Association for Computational Linguistics: ACL 2026
Recently, embedding resources, including models, benchmarks, and datasets, have been widely released to support a variety of languages. However, the Dutch language remains underrepresented, typically comprising only a small fraction of the published multilingual resources. To address this gap and encourage the further development of Dutch embeddings, we introduce new resources for their evaluation and generation. First, we introduce the Massive Text Embedding Benchmark for Dutch (MTEB-NL), which includes both existing Dutch datasets and newly created ones, covering a wide range of tasks. Second, we provide a training dataset compiled from available Dutch retrieval datasets, complemented with synthetic data generated by large language models to expand task coverage beyond retrieval. Finally, we release a series of compact yet efficient E5-NL embedding models that demonstrate strong performance across multiple tasks. We make our resources publicly available through the Hugging Face Hub and the MTEB package.
2020
Language Proficiency Scoring
Cristina Arhiliuc | Jelena Mitrović | Michael Granitzer
Proceedings of the Twelfth Language Resources and Evaluation Conference
The Common European Framework of Reference (CEFR) provides generic guidelines for the evaluation of language proficiency. Nevertheless, for automated proficiency classification systems, different approaches have been proposed for different languages. Our paper evaluates and extends the results of an approach to Automatic Essay Scoring proposed as part of the REPROLANG 2020 challenge. We compare our results with those from the published paper and include a new corpus for the English language for further experiments. Our results are lower than expected when using the same approach, and the system does not scale well with the added English corpus.