Pierre Marquis

2026

Credal Concept Bottleneck Models for Epistemic–Aleatoric Uncertainty Decomposition
Tanmoy Mukherjee | Thomas Bailleux | Pierre Marquis | Zied Bouraoui
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Concept Bottleneck Models (CBMs) predict through human-interpretable concepts, but they typically output point concept probabilities that conflate epistemic uncertainty (reducible model underspecification) with aleatoric uncertainty (irreducible input ambiguity). This makes concept-level uncertainty hard to interpret and, more importantly, hard to act upon. We introduce Credal Ensemble Concept Estimation, a CBM framework that decomposes concept uncertainty by construction. It represents each concept as a credal prediction (a probability interval), derives epistemic uncertainty from disagreement across diverse concept heads, and estimates aleatoric uncertainty via a dedicated ambiguity output trained to match annotator disagreement when available. The resulting signals support prescriptive decisions: automate low-uncertainty cases, prioritize data collection for high-epistemic cases, route high-aleatoric cases to human review, and abstain when both are high. Across several tasks, we show that epistemic uncertainty is positively associated with prediction errors, whereas aleatoric uncertainty closely tracks annotator disagreement, providing guidance beyond error correlation. Our implementation is available at the following link: https://github.com/Tankiit/Credal_Sets/tree/ensemble-credal-cbm

2025


Connecting Concept Layers and Rationales to Enhance Language Model Interpretability
Thomas Bailleux | Tanmoy Mukherjee | Pierre Marquis | Zied Bouraoui
Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)

With the introduction of large language models, NLP has undergone a paradigm shift where these models now serve as the backbone of most developed systems. However, while highly effective, they remain opaque and difficult to interpret, which limits their adoption in critical applications that require transparency and trust. Two major approaches aim to address this: rationale extraction, which highlights input spans that justify predictions, and concept bottleneck models, which make decisions through human-interpretable concepts. Yet each has limitations. Crucially, current models lack a unified framework that connects where a model looks (rationales) with why it makes a decision (concepts). We introduce CLARITY, a model that first selects key input spans, maps them to interpretable concepts, and then predicts using only those concepts. This design supports faithful, multi-level explanations and allows users to intervene at both the rationale and concept levels. CLARITY achieves competitive accuracy while offering improved transparency and controllability.
