Veronika Lipp
2026
A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding
Dilara Torunoğlu-Selamet | Doğukan Arslan | Rodrigo Wilkens | Wei He | Doruk Eryiğit | Thomas Pickard | Adriana S. Pagano | Aline Villavicencio | Gülşen Eryiğit | Ágnes Abuczki | Aida Cardoso | Alesia Lazarenka | Dina Almassova | Amália Mendes | Anna Kanellopoulou | Antoni Brosa-Rodriguez | Baiba Valkovska | Beata Wojtowicz | Bolette Pedersen | Carlos Manuel Hidalgo-Ternero | Chaya Liebeskind | Danka Jokić | Diego Alves | Eleni Triantafyllidi | Erik Velldal | Fred Philippy | Giedre Valunaite Oleskeviciene | Ieva Rizgeliene | Inguna Skadina | Irina Lobzhanidze | Isabell Stinessen Haugen | Jauza Akbar Krito | Jelena M. Marković | Johanna Monti | Josue Alejandro Sauca | Kaja Dobrovoljc Zor | Kingsley O. Ugwuanyi | Laura Rituma | Lilja Øvrelid | Maha Tufail Agro | Manzura Abjalova | Maria Chatzigrigoriou | María del Mar Sánchez Ramos | Marija Pendevska | Masoumeh Seyyedrezaei | Mehrnoush Shamsfard | Momina Ahsan | Muhammad Ahsan Riaz Khan | Nathalie Carmen Hau Norman | Nilay Erdem Ayyıldız | Nina Hosseini-Kivanani | Noémi Ligeti-Nagy | Numaan Naeem | Olha Kanishcheva | Olha Yatsyshyna | Daniil Orel | Petra Giommarelli | Petya Osenova | Radovan Garabik | Regina E. Semou | Rozane Rebechi | Salsabila Zahirah Pranida | Samia Touileb | Sanni Nimb | Sarfraz Ahmad | Sarvinoz Sharipova | Shahar Golan | Shaoxiong Ji | Sopuruchi Christian Aboh | Srdjan Sucur | Stella Markantonatou | Sussi Olsen | Vahide Tajalli | Veronika Lipp | Voula Giouli | Yelda Yeşildal Eraydın | Zahra Saaberi | Zhuohan Xie
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Potentially idiomatic expressions (PIEs) carry meanings inherently tied to the everyday experience of a given language community. As such, they constitute an interesting challenge for assessing the linguistic (and to some extent cultural) capabilities of NLP systems. In this paper, we present XMPIE, a parallel multilingual and multimodal dataset of potentially idiomatic expressions. The dataset, containing 34 languages and over ten thousand items, allows comparative analyses of idiomatic patterns among language-specific realisations and preferences in order to gather insights about shared cultural aspects. This parallel dataset allows evaluating language model performance on a given PIE in different languages and testing whether idiomatic understanding in one language can be transferred to another. Moreover, the dataset supports the study of PIEs across textual and visual modalities, to measure to what extent PIE understanding in one modality transfers to, or implies, understanding in another modality (text vs. image). The data was created by language experts, with both textual and visual components crafted under multilingual guidelines, and each PIE is accompanied by five images representing a spectrum from idiomatic to literal meanings, including semantically related and random distractors. The result is a high-quality benchmark for evaluating multilingual and multimodal idiomatic language understanding.
2023
XL-WA: a Gold Evaluation Benchmark for Word Alignment in 14 Language Pairs
Federico Martelli | Andrei Stefan Bejgu | Cesare Campagnano | Jaka Čibej | Rute Costa | Apolonija Gantar | Jelena Kallas | Svetla Peneva Koeva | Kristina Koppel | Simon Krek | Margit Langemets | Veronika Lipp | Sanni Nimb | Sussi Olsen | Bolette Sanford Pedersen | Valeria Quochi | Ana Salgado | László Simon | Carole Tiberius | Rafael-J Ureña-Ruiz | Roberto Navigli
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)
2020
A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
Sina Ahmadi | John P. McCrae | Sanni Nimb | Fahad Khan | Monica Monachini | Bolette S. Pedersen | Thierry Declerck | Tanja Wissik | Andrea Bellandi | Irene Pisani | Thomas Troelsgård | Sussi Olsen | Simon Krek | Veronika Lipp | Tamás Váradi | László Simon | András Győrffy | Carole Tiberius | Tanneke Schoonheim | Yifat Ben Moshe | Maya Rudich | Raya Abu Ahmad | Dorielle Lonke | Kira Kovalenko | Margit Langemets | Jelena Kallas | Oksana Dereza | Theodorus Fransen | David Cillessen | David Lindemann | Mikel Alonso | Ana Salgado | José Luis Sancho | Rafael-J. Ureña-Ruiz | Jordi Porta Zamorano | Kiril Simov | Petya Osenova | Zara Kancheva | Ivaylo Radev | Ranka Stanković | Andrej Perdih | Dejan Gabrovšek
Proceedings of the Twelfth Language Resources and Evaluation Conference
Aligning senses across resources and languages is a challenging task with beneficial applications in natural language processing and electronic lexicography. In this paper, we describe our efforts in manually aligning monolingual dictionaries. The alignment is carried out at the sense level for various resources in 15 languages. Moreover, senses are annotated with possible semantic relationships such as broadness, narrowness, relatedness, and equivalence. In comparison to previous datasets for this task, this dataset covers a wide range of languages and resources and focuses on the more challenging task of linking general-purpose language. We believe that our data will pave the way for further advances in alignment and evaluation of word senses by enabling new solutions, particularly those that notoriously require large amounts of data, such as neural networks. Our resources are publicly available at https://github.com/elexis-eu/MWSA.
Co-authors
- Sanni Nimb 3
- Sussi Olsen 3
- Jelena Kallas 2
- Simon Krek 2
- Margit Langemets 2
- Petya Osenova 2
- Bolette Sandford Pedersen 2
- Ana Salgado 2
- László Simon 2
- Carole Tiberius 2
- Rafael-J. Ureña-Ruiz 2
- Manzura Abjalova 1
- Sopuruchi Christian Aboh 1
- Raya Abu Ahmad 1
- Ágnes Abuczki 1
- Maha Tufail Agro 1
- Sarfraz Ahmad 1
- Sina Ahmadi 1
- Momina Ahsan 1
- Dina Almassova 1
- Mikel Alonso 1
- Diego Alves 1
- Doğukan Arslan 1
- Andrei Stefan Bejgu 1
- Andrea Bellandi 1
- Yifat Ben Moshe 1
- Cesare Campagnano 1
- Aida Cardoso 1
- Maria Chatzigrigoriou 1
- David Cillessen 1
- Rute Costa 1
- Thierry Declerck 1
- Oksana Dereza 1
- Kaja Dobrovoljc 1
- Nilay Erdem Ayyıldız 1
- Doruk Eryiğit 1
- Gülşen Eryiğit 1
- Theodorus Fransen 1
- Dejan Gabrovšek 1
- Apolonija Gantar 1
- Radovan Garabik 1
- Petra Giommarelli 1
- Voula Giouli 1
- Shahar Golan 1
- András Győrffy 1
- Isabell Stinessen Haugen 1
- Wei He 1
- Carlos Manuel Hidalgo-Ternero 1
- Nina Hosseini-Kivanani 1
- Shaoxiong Ji 1
- Danka Jokić 1
- Zara Kancheva 1
- Anna Kanellopoulou 1
- Olha Kanishcheva 1
- Fahad Khan 1
- Muhammad Ahsan Riaz Khan 1
- Svetla Peneva Koeva 1
- Kristina Koppel 1
- Kira Kovalenko 1
- Jauza Akbar Krito 1
- Alesia Lazarenka 1
- Chaya Liebeskind 1
- Noémi Ligeti-Nagy 1
- David Lindemann 1
- Irina Lobzhanidze 1
- Dorielle Lonke 1
- Stella Markantonatou 1
- Jelena M. Marković 1
- Federico Martelli 1
- John Philip McCrae 1
- Amália Mendes 1
- Monica Monachini 1
- Johanna Monti 1
- Numaan Naeem 1
- Roberto Navigli 1
- Nathalie Carmen Hau Norman 1
- Daniil Orel 1
- Adriana Silvina Pagano 1
- Marija Pendevska 1
- Andrej Perdih 1
- Fred Philippy 1
- Thomas Pickard 1
- Irene Pisani 1
- Salsabila Zahirah Pranida 1
- Valeria Quochi 1
- Ivaylo Radev 1
- María Del Mar Sánchez Ramos 1
- Rozane Rebechi 1
- Laura Rituma 1
- Ieva Rizgeliene 1
- Antoni Brosa Rodríguez 1
- Maya Rudich 1
- Zahra Saaberi 1
- José-Luis Sancho 1
- Bolette Sanford Pedersen 1
- Josue Alejandro Sauca 1
- Tanneke Schoonheim 1
- Regina E. Semou 1
- Masoumeh Seyyedrezaei 1
- Mehrnoush Shamsfard 1
- Sarvinoz Sharipova 1
- Kiril Simov 1
- Inguna Skadina 1
- Ranka Stanković 1
- Srdjan Sucur 1
- Vahide Tajalli 1
- Dilara Torunoğlu-Selamet 1
- Samia Touileb 1
- Eleni Triantafyllidi 1
- Thomas Troelsgård 1
- Kingsley O. Ugwuanyi 1
- Baiba Valkovska 1
- Giedre Valunaite Oleskeviciene 1
- Erik Velldal 1
- Aline Villavicencio 1
- Tamás Váradi 1
- Rodrigo Wilkens 1
- Tanja Wissik 1
- Beata Wójtowicz 1
- Zhuohan Xie 1
- Olha Yatsyshyna 1
- Yelda Yeşildal Eraydın 1
- Jordi Porta Zamorano 1
- Lilja Øvrelid 1
- Jaka Čibej 1