Pedro Reviriego


2025

pdf bib
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America
María Grandury | Javier Aula-Blasco | Júlia Falcão | Clémentine Fourrier | Miguel González Saiz | Gonzalo Martínez | Gonzalo Santamaria Gomez | Rodrigo Agerri | Nuria Aldama García | Luis Chiruzzo | Javier Conde | Helena Gomez Adorno | Marta Guerrero Nieto | Guido Ivetta | Natàlia López Fuertes | Flor Miriam Plaza-del-Arco | María-Teresa Martín-Valdivia | Helena Montoro Zamorano | Carmen Muñoz Sanz | Pedro Reviriego | Leire Rosado Plaza | Alejandro Vaca Serrano | Estrella Vallecillo-Rodríguez | Jorge Vallego | Irune Zubiaga
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Leaderboards showcase the current capabilities and limitations of Large Language Models (LLMs). To motivate the development of LLMs that represent the linguistic and cultural diversity of the Spanish-speaking community, we present La Leaderboard, the first open-source leaderboard to evaluate generative LLMs in languages and language varieties of Spain and Latin America. La Leaderboard is a community-driven project that aims to establish an evaluation standard for everyone interested in developing LLMs for the Spanish-speaking community. This initial version combines 66 datasets in Catalan, Basque, Galician, and different Spanish varieties, showcasing the evaluation results of 50 models. To encourage community-driven development of leaderboards in other languages, we explain our methodology, including guidance on selecting the most suitable evaluation setup for each downstream task. In particular, we provide a rationale for using fewer few-shot examples than typically found in the literature, aiming to reduce environmental impact and facilitate access to reproducible results for a broader research community.