Jérémie Bogaert

2026

Explaining Explanations: Interpretability Methods for Discourse Analysis of Transformer Attention Maps
Louis Escouflaire | Jérémie Bogaert | Antonin Descampe | Cédrick Fairon | François-Xavier Standaert
Proceedings of the Fifteenth Language Resources and Evaluation Conference

While LLMs have achieved state-of-the-art performance in NLP, their opacity hinders human understanding of their predictions. Standard explainability techniques often prioritize technical faithfulness over linguistic plausibility. This paper argues for an interdisciplinary approach that integrates discourse analysis to critically interpret model explanations. We conduct a case study using CamemBERT, fine-tuned to classify French journalistic texts as news or opinion. We employ Layer-wise Relevance Propagation to generate attention maps for 1,000 test articles and analyze the token-level relevance scores through both in-depth qualitative analysis and a quantitative ranking of high-attention tokens. Our findings reveal that CamemBERT successfully captures genre-specific linguistic markers: it attends to cues of reported speech and temporal anchors in news, and to expressive punctuation, evaluative adjectives, and first-person pronouns in opinion. The discourse-analytic lens moves us beyond superficial observations, demonstrating how the model interprets features like punctuation as structural or stylistic conventions. We argue that integrating linguistic expertise into the explainability pipeline yields more nuanced, human-readable explanations.

2024

Sensibilité des explications à l’aléa des grands modèles de langage : le cas de la classification de textes journalistiques [Sensitivity of Explanations to the Randomness of Large Language Models: a Case Study on Journalistic Text Classification]
Jérémie Bogaert | Marie-Catherine de Marneffe | Antonin Descampe | Louis Escouflaire | Cédrick Fairon | François-Xavier Standaert
Traitement Automatique des Langues, Volume 64, Numéro 3 : Explicabilité des modèles de TAL [Explainability of NLP models]

Venues

LREC (1)
TAL (1)
