Alejandro Vaca Serrano
2025
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America
María Grandury | Javier Aula-Blasco | Júlia Falcão | Clémentine Fourrier | Miguel González Saiz | Gonzalo Martínez | Gonzalo Santamaria Gomez | Rodrigo Agerri | Nuria Aldama García | Luis Chiruzzo | Javier Conde | Helena Gomez Adorno | Marta Guerrero Nieto | Guido Ivetta | Natàlia López Fuertes | Flor Miriam Plaza-del-Arco | María-Teresa Martín-Valdivia | Helena Montoro Zamorano | Carmen Muñoz Sanz | Pedro Reviriego | Leire Rosado Plaza | Alejandro Vaca Serrano | Estrella Vallecillo-Rodríguez | Jorge Vallego | Irune Zubiaga
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Leaderboards showcase the current capabilities and limitations of Large Language Models (LLMs). To motivate the development of LLMs that represent the linguistic and cultural diversity of the Spanish-speaking community, we present La Leaderboard, the first open-source leaderboard to evaluate generative LLMs in languages and language varieties of Spain and Latin America. La Leaderboard is a community-driven project that aims to establish an evaluation standard for everyone interested in developing LLMs for the Spanish-speaking community. This initial version combines 66 datasets in Catalan, Basque, Galician, and different Spanish varieties, showcasing the evaluation results of 50 models. To encourage community-driven development of leaderboards in other languages, we explain our methodology, including guidance on selecting the most suitable evaluation setup for each downstream task. In particular, we provide a rationale for using fewer few-shot examples than typically found in the literature, aiming to reduce environmental impact and facilitate access to reproducible results for a broader research community.
Co-authors
- Rodrigo Agerri 1
- Javier Aula-Blasco 1
- Luis Chiruzzo 1
- Javier Conde 1
- Júlia Falcão 1
- Clémentine Fourrier 1
- Natàlia López Fuertes 1
- Nuria Aldama García 1
- Gonzalo Santamaria Gomez 1
- Helena Gomez Adorno 1
- María Grandury 1
- Guido Ivetta 1
- María-Teresa Martín-Valdivia 1
- Gonzalo Martínez 1
- Marta Guerrero Nieto 1
- Leire Rosado Plaza 1
- Flor Miriam Plaza-del-Arco 1
- Pedro Reviriego 1
- Miguel González Saiz 1
- Carmen Muñoz Sanz 1
- Estrella Vallecillo-Rodríguez 1
- Jorge Vallego 1
- Helena Montoro Zamorano 1
- Irune Zubiaga 1
Venues
- ACL 1