Diandra Fabre
2026
Building a Dataset for French Accent Classification Evaluation: Are We There Yet?
Diandra Fabre | Mathieu Avanzi | François Portet
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Current evaluation practices in speech processing systems often overlook the diversity of spoken accents, leading to significant performance disparities across speaker groups. This issue stems largely from biases and imbalances in training corpora, and is compounded by the scarcity of open-source datasets suitable for evaluating accent variability in French. To address this gap, we extend the CFPR dataset with explicit accent labels, providing a new benchmark for assessing the robustness of speech technology systems across diverse French accents. We additionally conduct a perceptual study with 87 human participants to evaluate the reliability and interpretability of these labels. Using this resource, we evaluate an eight-class French accent classifier trained on Common Voice data. The first results highlight both the complexity of automatic French accent recognition in low-resource settings and the difficulty French speakers have in perceiving the full range of linguistic variation across French-speaking countries.
BenCSSmark: Making the Social Sciences Count in LLM Research
Arnault Chatelain | Etienne Ollion | Qianwen Guan | Diandra Fabre | Lorraine Goeuriot | Emile Chapuis | Abdelkrim Beloued | Marie Candito | Nicolas Hervé | Didier Schwab
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This position paper argues that the under-representation of social science tasks in contemporary LLM benchmarks limits advances in both LLM evaluation and social scientific inquiry. Benchmarks, the standardized tools used to assess computational systems, are pivotal in the development of artificial intelligence (AI), including large language models (LLMs). They do more than measure progress: they actively structure it, shaping reputations, research agendas, and commercial outcomes. Despite this central role, the social sciences are largely absent from mainstream evaluation frameworks, even though scholars in these fields generate dozens of rigorously annotated, context-sensitive datasets each year. Integrating this work into benchmark design could significantly improve the generalization and robustness of AI models. In turn, models trained on social scientific tasks would likely perform better on classic and contemporary tasks in disciplines as diverse as history, sociology, political science, and economics. This is all the more pressing as these disciplines are quickly turning to LLMs for assistance. To address this gap, we introduce BenCSSmark, a benchmark composed of datasets annotated by computational social scientists. By integrating social scientific perspectives into benchmarking, BenCSSmark seeks to promote more robust, transparent, and socially relevant AI systems and to foster efficient collaboration.
Pantagruel: Unified Self-Supervised Encoders for French Text and Speech
Phuong-Hang Le | Valentin Pelloin | Arnault Chatelain | Maryem Bouziane | Mohammed Ghennai | Qianwen Guan | Kirill Milintsevich | Salima Mdhaffar | Aidan Mannion | Nils Defauw | Shuyue Gu | Alexandre Daniel Audibert | Marco Dinarelli | Yannick Estève | Lorraine Goeuriot | Steffen Lalande | Nicolas Hervé | Maximin Coavoux | François Portet | Étienne Ollion | Marie Candito | Maxime Peyrard | Solange Rossato | Benjamin Lecouteux | Aurélie Nardy | Gilles Sérasset | Vincent Segonne | Solène Evain | Diandra Fabre | Didier Schwab
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We release Pantagruel models, a new family of self-supervised encoder models for French text and speech. Instead of predicting modality-tailored targets such as textual tokens or speech units, Pantagruel learns contextualized target representations in the feature space, allowing modality-specific encoders to capture linguistic and acoustic regularities more effectively. Separate models are pre-trained on large-scale French corpora, including Wikipedia, OSCAR, and CroissantLLM for text, and Multilingual LibriSpeech, LeBenchmark, and INA-100k for speech. INA-100k is a newly introduced 100,000-hour corpus of French audio derived from the archives of the Institut National de l’Audiovisuel (INA), the national repository of French radio and television broadcasts, providing highly diverse audio data. We evaluate Pantagruel across a broad range of downstream tasks spanning both modalities, including tasks from standard French benchmarks such as FLUE and LeBenchmark. Across these tasks, Pantagruel models show competitive or superior performance compared to strong French baselines such as CamemBERT, FlauBERT, and LeBenchmark2.0, while maintaining a shared architecture that can seamlessly handle either speech or text inputs. These results confirm the effectiveness of feature-space self-supervised objectives for French representation learning and highlight Pantagruel as a robust foundation for multimodal speech-text understanding.
2025
Corpus bilingue sous-titrage et Langue des Signes Française : la problématique de l’alignement automatique des données
Julie Halbout | Diandra Fabre
Actes des 18e Rencontres Jeunes Chercheurs en RI (RJCRI) et 27ème Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL)
In this article, we present a study of the problem of automatic data alignment in a corpus made up of speeches in spoken French, subtitled in written French and interpreted in French Sign Language (LSF). After an introduction describing the very particular process of sign language interpretation, we survey existing datasets for LSF and the specific features of the Matignon-LSF corpus, built from the weekly video reports of the Council of Ministers. We then show, through a few examples, some of the phenomena observed in the temporal alignment between the subtitles, which are synchronized with the audio, and the interpreted LSF, which is subject to a time lag. We conclude that alignment cannot be done at the level of written French sentences, and we propose some directions for future work.
SuperGPQA-HCE-FR : un corpus spécialisé en français pour le domaine hydraulique et le génie civil
Markarit Vartampetian | Diandra Fabre | Philippe Mulhem | Sylvain Joubert | Didier Schwab
Actes de l'atelier Évaluation des modèles génératifs (LLM) et challenge 2025 (EvalLLM)
In this article, we present SuperGPQA-HCE-FR, a French adaptation of a subset of the SuperGPQA benchmark focused on hydraulic engineering and civil engineering. It comprises 285 multiple-choice questions designed to evaluate and specialize multilingual large language models (LLMs) on technical tasks. The translation, produced automatically, is then evaluated by domain experts. Finally, we present first results on general-purpose multilingual Instruct models, comparing performance on the original English corpus with performance on the translated French corpus.
2024
Matignon-LSF: a Large Corpus of Interpreted French Sign Language
Julie Halbout | Diandra Fabre | Yanis Ouakrim | Julie Lascar | Annelies Braffort | Michèle Gouiffès | Denis Beautemps
Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources
Co-authors
- Didier Schwab 3
- Marie Candito 2
- Arnault Chatelain 2
- Lorraine Goeuriot 2
- Qianwen Guan 2
- Julie Halbout 2
- Nicolas Hervé 2
- Etienne Ollion 2
- François Portet 2
- Alexandre Daniel Audibert 1
- Mathieu Avanzi 1
- Denis Beautemps 1
- Abdelkrim Beloued 1
- Maryem Bouziane 1
- Annelies Braffort 1
- Emile Chapuis 1
- Maximin Coavoux 1
- Nils Defauw 1
- Marco Dinarelli 1
- Yannick Estève 1
- Solène Evain 1
- Mohammed Ghennai 1
- Michèle Gouiffès 1
- Shuyue Gu 1
- Sylvain Joubert 1
- Steffen Lalande 1
- Julie Lascar 1
- Phuong-Hang Le 1
- Benjamin Lecouteux 1
- Aidan Mannion 1
- Salima Mdhaffar 1
- Kirill Milintsevich 1
- Philippe Mulhem 1
- Aurélie Nardy 1
- Yanis Ouakrim 1
- Valentin Pelloin 1
- Maxime Peyrard 1
- Solange Rossato 1
- Vincent Segonne 1
- Gilles Sérasset 1
- Markarit Vartampetian 1