Badreddine Noune
2023
AlGhafa Evaluation Benchmark for Arabic Language Models
Ebtesam Almazrouei | Ruxandra Cojocaru | Michele Baldo | Quentin Malartic | Hamza Alobeidli | Daniele Mazzotta | Guilherme Penedo | Giulia Campesan | Mugariya Farooq | Maitha Alhammadi | Julien Launay | Badreddine Noune
Proceedings of ArabicNLP 2023
Recent advances in the space of Arabic large language models have opened up a wealth of potential practical applications. Through optimal training strategies, large-scale data acquisition, and continuously increasing NLP resources, the Arabic LLM landscape has improved in a very short span of time, despite being plagued by training data scarcity and limited evaluation resources compared to English. To contribute to this ever-growing field, we introduce AlGhafa, a new multiple-choice evaluation benchmark for Arabic LLMs. For showcasing purposes, we train a new suite of models, including a 14 billion parameter model, the largest monolingual Arabic decoder-only model to date. We use a collection of publicly available datasets, as well as a newly introduced HandMade dataset consisting of 8 billion tokens. Finally, we explore the quantitative and qualitative toxicity of several Arabic models, comparing our models to existing public Arabic LLMs.