Qianwen Guan

2026

This position paper argues that the under-representation of social science tasks in contemporary LLM benchmarks limits advances in both LLM evaluation and social scientific inquiry. Benchmarks — standardized tools for assessing computational systems — are pivotal in the development of artificial intelligence (AI), including large language models (LLMs). Benchmarks do more than measure progress — they actively structure it, shaping reputations, research agendas, and commercial outcomes. Despite this central role, the social sciences are largely absent from mainstream evaluation frameworks, even though scholars in these fields generate dozens of rigorously annotated, context-sensitive datasets each year. Integrating this work into benchmark design could significantly improve the generalization and robustness of AI models. In turn, models trained on social scientific tasks would likely yield better performance on classic and contemporary tasks in disciplines as diverse as history, sociology, political science or economics. This is all the more pressing as these disciplines are quickly turning to LLMs for assistance. To address this gap, we introduce BenCSSmark, a benchmark composed of datasets annotated by computational social scientists. By integrating social scientific perspectives into benchmarking, BenCSSmark seeks to promote more robust, transparent, and socially relevant AI systems and to foster efficient collaboration.

We release Pantagruel models, a new family of self-supervised encoder models for French text and speech. Instead of predicting modality-tailored targets such as textual tokens or speech units, Pantagruel learns contextualized target representations in the feature space, allowing modality-specific encoders to capture linguistic and acoustic regularities more effectively. Separate models are pre-trained on large-scale French corpora, including Wikipedia, OSCAR and CroissantLLM for text, together with MultilingualLibriSpeech, LeBenchmark, and INA-100k for speech. INA-100k is a newly introduced 100,000-hour corpus of French audio derived from the archives of the Institut National de l’Audiovisuel (INA), the national repository of French radio and television broadcasts, providing highly diverse audio data. We evaluate Pantagruel across a broad range of downstream tasks spanning both modalities, including those from the standard French benchmarks such as FLUE or LeBenchmark. Across these tasks, Pantagruel models show competitive or superior performance compared to strong French baselines such as CamemBERT, FlauBERT, and LeBenchmark2.0, while maintaining a shared architecture that can seamlessly handle either speech or text inputs. These results confirm the effectiveness of feature-space self-supervised objectives for French representation learning and highlight Pantagruel as a robust foundation for multimodal speech-text understanding.

2016


La perception des séquences consonantiques non-natives par les locuteurs monolingues de mandarin (Perception of non-native consonant sequences by Mandarin monolingual speakers)
Qianwen Guan | Harim Kwon
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP

This study examines the role of native phonotactic structure and phonetic factors in the perception of non-native consonant sequences. Monolingual Mandarin speakers were tested in two experiments. In the first experiment, listeners had to decide whether they heard a vowel between two consonants while listening to intervocalic CC sequences (akta) and their CVC controls (akata). The monolingual Mandarin participants tended to perceive a vowel between the two consonants in both CC and CVC sequences, but the percentage of perceived vowels varied across sequences. In the second experiment, the same participants listened to initial and intervocalic CC sequences (ktapa, akta) as well as CVC sequences (katapa, akata) and transcribed them in Pinyin. The strategies observed in the transcriptions (epenthesis, metathesis, deletion of C1, and deletion of C2) show that participants are sensitive to phonetic factors. The results of both experiments suggest that native phonotactics as well as phonetic factors affect the perception of non-native sequences.
