Zahra Saaberi - ACL Anthology


Zahra Saaberi

2026

A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding
Dilara Torunoğlu-Selamet | Doğukan Arslan | Rodrigo Wilkens | Wei He | Doruk Eryiğit | Thomas Pickard | Adriana S. Pagano | Aline Villavicencio | Gülşen Eryiğit | Ágnes Abuczki | Aida Cardoso | Alesia Lazarenka | Dina Almassova | Amália Mendes | Anna Kanellopoulou | Antoni Brosa-Rodriguez | Baiba Valkovska | Beata Wojtowicz | Bolette Pedersen | Carlos Manuel Hidalgo-Ternero | Chaya Liebeskind | Danka Jokić | Diego Alves | Eleni Triantafyllidi | Erik Velldal | Fred Philippy | Giedre Valunaite Oleskeviciene | Ieva Rizgeliene | Inguna Skadina | Irina Lobzhanidze | Isabell Stinessen Haugen | Jauza Akbar Krito | Jelena M. Marković | Johanna Monti | Josue Alejandro Sauca | Kaja Dobrovoljc Zor | Kingsley O. Ugwuanyi | Laura Rituma | Lilja Øvrelid | Maha Tufail Agro | Manzura Abjalova | Maria Chatzigrigoriou | María del Mar Sánchez Ramos | Marija Pendevska | Masoumeh Seyyedrezaei | Mehrnoush Shamsfard | Momina Ahsan | Muhammad Ahsan Riaz Khan | Nathalie Carmen Hau Norman | Nilay Erdem Ayyıldız | Nina Hosseini-Kivanani | Noémi Ligeti-Nagy | Numaan Naeem | Olha Kanishcheva | Olha Yatsyshyna | Daniil Orel | Petra Giommarelli | Petya Osenova | Radovan Garabik | Regina E. Semou | Rozane Rebechi | Salsabila Zahirah Pranida | Samia Touileb | Sanni Nimb | Sarfraz Ahmad | Sarvinoz Sharipova | Shahar Golan | Shaoxiong Ji | Sopuruchi Christian Aboh | Srdjan Sucur | Stella Markantonatou | Sussi Olsen | Vahide Tajalli | Veronika Lipp | Voula Giouli | Yelda Yeşildal Eraydın | Zahra Saaberi | Zhuohan Xie
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Potentially idiomatic expressions (PIEs) carry meanings inherently tied to the everyday experience of a given language community. As such, they constitute an interesting challenge for assessing the linguistic (and to some extent cultural) capabilities of NLP systems. In this paper, we present XMPIE, a parallel multilingual and multimodal dataset of potentially idiomatic expressions. The dataset, covering 34 languages and over ten thousand items, allows comparative analyses of idiomatic patterns among language-specific realisations and preferences in order to gather insights about shared cultural aspects. This parallel dataset allows evaluation of language model performance on a given PIE across languages, and of whether idiomatic understanding in one language can be transferred to another. Moreover, the dataset supports the study of PIEs across textual and visual modalities, to measure to what extent PIE understanding in one modality transfers to, or implies, understanding in the other (text vs. image). The data was created by language experts, with both textual and visual components crafted under multilingual guidelines. Each PIE is accompanied by five images representing a spectrum from idiomatic to literal meanings, including semantically related and random distractors. The result is a high-quality benchmark for evaluating multilingual and multimodal idiomatic language understanding.

2025

Advancing Persian LLM Evaluation
Sara Bourbour Hosseinbeigi | Behnam Rohani | Mostafa Masoudi | Mehrnoush Shamsfard | Zahra Saaberi | Mostafa Karimi Manesh | Mohammad Amin Abbasi
Findings of the Association for Computational Linguistics: NAACL 2025

Evaluation of large language models (LLMs) in low-resource languages like Persian has received less attention than in high-resource languages like English. Existing evaluation approaches for Persian LLMs generally lack comprehensive frameworks, limiting their ability to assess models’ performance over a wide range of tasks requiring considerable cultural and contextual knowledge, as well as a deeper understanding of Persian literature and style. The first aim of this paper is to fill this gap by providing two new benchmarks, PeKA and PK-BETS, which cover topics such as history, literature, and cultural knowledge and challenge the abilities of present state-of-the-art models in a variety of Persian language comprehension tasks. These datasets are meant to reduce data contamination while providing an accurate assessment of Persian LLMs. The second aim of this paper is a general evaluation of LLMs across the current Persian benchmarks to provide a comprehensive performance overview. By offering a structured evaluation methodology, we hope to promote the examination of LLMs in the Persian language.

Co-authors

Maha Tufail Agro 1

Sarfraz Ahmad 1

Dina Almassova 1

Doğukan Arslan 1

Maria Chatzigrigoriou 1

Kaja Dobrovoljc 1

Nilay Erdem Ayyıldız 1

Doruk Eryiğit 1

Gülşen Eryiğit 1

Radovan Garabik 1

Petra Giommarelli 1

Isabell Stinessen Haugen 1

Carlos Manuel Hidalgo-Ternero 1

Sara Bourbour Hosseinbeigi 1

Nina Hosseini-Kivanani 1

Anna Kanellopoulou 1

Olha Kanishcheva 1

Muhammad Ahsan Riaz Khan 1

Jauza Akbar Krito 1

Alesia Lazarenka 1

Chaya Liebeskind 1

Noémi Ligeti-Nagy 1

Veronika Lipp 1

Irina Lobzhanidze 1

Mostafa Karimi Manesh 1

Stella Markantonatou 1

Jelena M. Marković 1

Mostafa Masoudi 1

Amália Mendes 1

Johanna Monti 1

Nathalie Carmen Hau Norman 1

Petya Osenova 1

Adriana Silvina Pagano 1

Bolette Sandford Pedersen 1

Marija Pendevska 1

Fred Philippy 1

Thomas Pickard 1

Salsabila Zahirah Pranida 1

María Del Mar Sánchez Ramos 1

Rozane Rebechi 1

Ieva Rizgeliene 1

Antoni Brosa Rodríguez 1

Behnam Rohani 1

Josue Alejandro Sauca 1

Regina E. Semou 1

Masoumeh Seyyedrezaei 1

Sarvinoz Sharipova 1

Inguna Skadina 1

Vahide Tajalli 1

Dilara Torunoğlu-Selamet 1

Samia Touileb 1

Eleni Triantafyllidi 1

Kingsley O. Ugwuanyi 1

Baiba Valkovska 1

Giedre Valunaite Oleskeviciene 1

Aline Villavicencio 1

Rodrigo Wilkens 1

Beata Wójtowicz 1

Olha Yatsyshyna 1

Yelda Yeşildal Eraydın 1

Lilja Øvrelid 1

Venues

Findings 1
LREC 1