Léo Labat
2026
Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions
Léo Labat | Etienne Ollion | François Yvon
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Léo Labat | Etienne Ollion | François Yvon
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Multiple-Choice Questions (MCQs) are often used to assess knowledge, reasoning abilities, and even values encoded in large language models (LLMs). While the effect of multilingualism has been studied on LLM factual recall, this paper seeks to investigate the less explored question of language-induced variation in value-laden MCQ responses. Are multilingual LLMs consistent in their responses across languages, i.e. behave like theoretical polyglots, or do they answer value-laden MCQs depending on the language of the question, like a multitude of monolingual models expressing different values through a single model? We release a new corpus, the Multilingual European Value Survey (MEVS), which, unlike prior work relying on machine translation or ad hoc prompts, solely comprises human-translated survey questions aligned in 8 European languages. We administer a subset of those questions to over thirty multilingual LLMs of various sizes, manufacturers and alignment-fine-tuning status under comprehensive, controlled prompt variations including answer order, symbol type, and tail character. Our results show that while larger, instruction-tuned models display higher overall consistency, the robustness of their responses varies greatly across questions, with certain MCQs eliciting total agreement within and across models while others leave LLM answers split. Language-specific behavior seems to arise in all consistent, instruction-fine-tuned models, but only on certain questions, warranting a further study of the selective effect of preference fine-tuning.
The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations
Pierre-Antoine Lequeu | Léo Labat | Laurène Cave | Gaël Lejeune | François Yvon | Benjamin Piwowarski
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Pierre-Antoine Lequeu | Léo Labat | Laurène Cave | Gaël Lejeune | François Yvon | Benjamin Piwowarski
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
LLMs are ubiquitous in modern NLP, and while their applicability extends to texts produced for democratic activities such as online deliberations or large-scale citizen consultations, ethical questions have been raised for their usage as analysis tools. We continue this line of research with two main goals: (a) to develop resources that can help standardize citizen contributions in public forums at the pragmatic level, and make them easier to use in topic modeling and political analysis; (b) to study how well this standardization can reliably be performed by small, open-weights LLMs, i.e. models that can be run locally and transparently with limited resources. Accordingly, we introduce Corpus Clarification as a preprocessing framework for large-scale consultation data that transforms noisy, multi-topic contributions into structured, self-contained argumentative units ready for downstream analysis. We present GDN-CC, a manually-curated dataset of 1,231 contributions to the French Grand Débat National, comprising 2,285 argumentative units annotated for argumentative structure and manually clarified. We then show that finetuned Small Language Models match or outperform LLMs on reproducing these annotations, and measure their usability for an opinion clustering task. We finally release GDN-CC-large, an automatically annotated corpus of 240k contributions, the largest annotated democratic consultation dataset to date.
2025
Comment mesurer les biais politiques des grands modèles de langue multilingues?
Paul Lerner | Laurène Cave | Hal Daumé | Léo Labat | Gaël Lejeune | Pierre-Antoine Lequeu | Benjamin Piwowarski | Nazanin Shafiabadi | François Yvon
Actes de l'atelier Ethic and Alignment of (Large) Language Models 2025 (EALM)
Paul Lerner | Laurène Cave | Hal Daumé | Léo Labat | Gaël Lejeune | Pierre-Antoine Lequeu | Benjamin Piwowarski | Nazanin Shafiabadi | François Yvon
Actes de l'atelier Ethic and Alignment of (Large) Language Models 2025 (EALM)
Nous proposons une nouvelle méthode pour mesurer les biais politiques des grands modèles de langue multilingues pour la traduction automatique, l’aide à la rédaction et le résumé automatique. Nous nous appuyons sur une représentation dense des opinions politiques exprimées dans les textes, apprise de façon faiblement supervisée.
2024
Évaluation de l’apport des chaînes de coréférences pour le liage d’entités
Léo Labat | Lauriane Aufrant
Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 1 : articles longs et prises de position
Léo Labat | Lauriane Aufrant
Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 1 : articles longs et prises de position
Ce travail propose de revisiter les approches de liage d’entités au regard de la tâche très prochequ’est la résolution de coréférence. Nous observons en effet différentes configurations (appuyéespar l’exemple) où le reste de la chaîne de coréférence peut fournir des indices utiles pour améliorerla désambiguïsation. Guidés par ces motivations théoriques, nous menons une analyse d’erreursaccompagnée d’expériences oracles qui confirment le potentiel de stratégies de combinaison deprédictions au sein de la chaîne de coréférence (jusqu’à 4.3 F1 sur les mentions coréférentes en anglais). Nousesquissons alors une première preuve de concept de combinaison par vote, en explorant différentesheuristiques de pondération, qui apporte des gains modestes mais interprétables.