Building a Dataset for French Accent Classification Evaluation: Are We There Yet?

Diandra Fabre, Mathieu Avanzi, François Portet


Abstract
Current evaluation practices in speech processing systems often overlook the diversity of spoken accents, leading to significant performance disparities across speaker groups. This issue stems largely from biases and imbalances in training corpora, and is compounded by the scarcity of open-source datasets suitable for evaluating accent variability in French. To address this gap, we extend the CFPR dataset with explicit accent labels, providing a new benchmark for assessing the robustness of speech technology systems across diverse French accents. We additionally conduct a perceptual study with 87 human participants to evaluate the reliability and interpretability of these labels. Using this resource, we evaluate an eight-class French accent classifier trained on Common Voice data. The first results highlight both the complexity of automatic French accent recognition in low-resource settings and the difficulty French speakers have in perceiving the full range of linguistic variation across French-speaking countries.
Anthology ID:
2026.lrec-main.450
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
5711–5721
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.450/
Cite (ACL):
Diandra Fabre, Mathieu Avanzi, and François Portet. 2026. Building a Dataset for French Accent Classification Evaluation: Are We There Yet?. International Conference on Language Resources and Evaluation, main:5711–5721.
Cite (Informal):
Building a Dataset for French Accent Classification Evaluation: Are We There Yet? (Fabre et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.450.pdf