Rachida Kebdani

2025

Evaluating Calibration of Arabic Pre-trained Language Models on Dialectal Text
Ali Al-Laith | Rachida Kebdani
Proceedings of the 4th Workshop on Arabic Corpus Linguistics (WACL-4)

While pre-trained language models have made significant progress in different classification tasks, little attention has been given to the reliability of their confidence scores. Calibration, how well model confidence aligns with actual accuracy, is essential for real-world applications where decisions rely on probabilistic outputs. This study addresses this gap in Arabic dialect identification by assessing the calibration of eight pre-trained language models, ensuring their predictions are not only accurate but also reliable for practical applications. We analyze two datasets: one with over 1 million text samples and the Nuanced Arabic Dialect Identification dataset (NADI-2023). Using Expected Calibration Error (ECE) as a metric, we reveal substantial variation in model calibration across dialects in both datasets, showing that prediction confidence can vary significantly depending on regional data. This research has implications for improving the reliability of Arabic dialect models in applications like sentiment analysis and social media monitoring.

Co-authors

Ali Al-Laith (1)

Venues

wacl (1)
ws (1)
