Nicolas Chesneau
2025
Towards Achieving Concept Completeness for Textual Concept Bottleneck Models
Milan Bhan | Yann Choho | Jean-Noël Vittaut | Nicolas Chesneau | Pierre Moreau | Marie-Jeanne Lesot
Findings of the Association for Computational Linguistics: EMNLP 2025
This paper proposes the Complete Textual Concept Bottleneck Model (CT-CBM), a novel TCBM generator that builds concept labels in a fully unsupervised manner using a small language model, eliminating both the need for predefined human-labeled concepts and for LLM annotations. CT-CBM iteratively targets and adds important and identifiable concepts to the bottleneck layer in order to create a complete concept basis. CT-CBM achieves striking results against competitors in terms of concept basis completeness and concept detection accuracy, offering a promising solution to reliably enhance the interpretability of NLP classifiers.
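As a rough illustration of the iterative concept-selection loop the abstract describes, the sketch below greedily adds candidate concepts to a bottleneck until a linear surrogate reproduces a black-box classifier's predictions. It is not the authors' code: the synthetic concept activations, the simulated black-box labels, and the 0.95 completeness threshold are hypothetical stand-ins.

```python
# Illustrative sketch only (not the CT-CBM implementation): greedily add
# candidate concepts to a bottleneck until a linear head on those concepts
# matches the black-box predictions ("completeness"). Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, n_candidates = 500, 12
concept_acts = rng.normal(size=(n, n_candidates))          # candidate concept activations
blackbox_preds = concept_acts[:, [0, 3, 7]].sum(1) > 0     # predictions to be explained

def completeness(selected):
    """Accuracy of a linear head on the selected concepts w.r.t. the black box."""
    if not selected:
        return max(blackbox_preds.mean(), 1 - blackbox_preds.mean())
    head = LogisticRegression().fit(concept_acts[:, selected], blackbox_preds)
    return head.score(concept_acts[:, selected], blackbox_preds)

selected, score = [], completeness([])
while score < 0.95 and len(selected) < n_candidates:
    # target the concept whose addition most increases completeness
    gains = {c: completeness(selected + [c])
             for c in range(n_candidates) if c not in selected}
    best, best_score = max(gains.items(), key=lambda kv: kv[1])
    if best_score <= score:   # no remaining concept improves the bottleneck
        break
    selected.append(best)
    score = best_score

print(f"bottleneck concepts: {selected}, completeness: {score:.2f}")
```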
2024
Self-AMPLIFY: Improving Small Language Models with Self Post Hoc Explanations
Milan Bhan | Jean-Noël Vittaut | Nicolas Chesneau | Marie-Jeanne Lesot
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Incorporating natural language rationales in the prompt and In-Context Learning (ICL) has led to significant improvements in Large Language Model (LLM) performance. However, generating high-quality rationales requires human annotation or the use of auxiliary proxy models. In this work, we propose Self-AMPLIFY, which automatically generates rationales from post hoc explanation methods applied to Small Language Models (SLMs) to improve their own performance. Self-AMPLIFY is a 3-step method that targets samples, generates rationales and builds a final prompt to leverage ICL. Self-AMPLIFY's performance is evaluated on four SLMs and five datasets requiring strong reasoning abilities. Self-AMPLIFY achieves good results against competitors, leading to strong accuracy improvements. Self-AMPLIFY is the first method to apply post hoc explanation methods to autoregressive language models in order to generate rationales that improve their own performance, in a fully automated manner.
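The following sketch mirrors the 3-step structure described above (target samples, generate rationales, build the ICL prompt) under loose assumptions rather than the paper's implementation: the `attribute` function is a trivial keyword stand-in for the post hoc explainers actually applied to the SLM, and the prompt template is invented for illustration.

```python
# Hedged sketch of the 3-step Self-AMPLIFY pipeline, not the authors' code:
# (1) target demonstration samples, (2) turn a post hoc attribution into a
# rationale, (3) assemble the final in-context learning prompt.
targets = [
    {"text": "The movie was painfully slow and dull.", "label": "negative"},
    {"text": "A warm, funny and moving story.", "label": "positive"},
]

def attribute(text):
    # stand-in for a real post hoc explainer: rank tokens by length as a dummy proxy
    tokens = text.rstrip(".").split()
    return sorted(tokens, key=len, reverse=True)[:3]

def build_rationale(sample):
    salient = ", ".join(attribute(sample["text"]))
    return f"The key words {salient} indicate a {sample['label']} sentiment."

def build_prompt(samples, query):
    demos = "\n\n".join(
        f"Review: {s['text']}\nRationale: {build_rationale(s)}\nLabel: {s['label']}"
        for s in samples
    )
    return f"{demos}\n\nReview: {query}\nRationale:"

print(build_prompt(targets, "An unexpected delight from start to finish."))
```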
2023
Enhancing textual counterfactual explanation intelligibility through Counterfactual Feature Importance
Milan Bhan | Jean-Noël Vittaut | Nicolas Chesneau | Marie-Jeanne Lesot
Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023)
Textual counterfactual examples explain a prediction by modifying the tokens of an initial instance in order to flip the outcome of a classifier. Even under a sparsity constraint, counterfactual generation can lead to numerous changes from the initial text, making the explanation hard to understand. We propose Counterfactual Feature Importance, a method to make non-sparse counterfactual explanations more intelligible. Counterfactual Feature Importance assesses the importance of each token change between an instance to explain and its counterfactual example. We develop two ways of computing Counterfactual Feature Importance, based respectively on classifier gradient computation and on the evolution of the counterfactual generator loss during counterfactual search. We then design a global version of Counterfactual Feature Importance, providing rich information about the semantic fields that globally impact classifier predictions. Counterfactual Feature Importance makes it possible to focus on the impactful parts of counterfactual explanations, making explanations involving numerous changes more understandable.
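To make the gradient-based variant concrete, here is a hedged sketch, not the released code: a toy embedding table and linear classifier stand in for the real model, and each token change is scored as the dot product between the prediction's gradient and the embedding shift from the original token to its counterfactual replacement.

```python
# Illustrative sketch only: gradient-based scoring of token changes between an
# instance and its counterfactual. The vocabulary, embedding table and linear
# classifier are random toy stand-ins, not the paper's model.
import torch

vocab = {"the": 0, "food": 1, "was": 2, "awful": 3, "delicious": 4}
emb = torch.nn.Embedding(len(vocab), 8)
clf = torch.nn.Linear(8, 1)   # toy classifier over mean-pooled token embeddings

original = torch.tensor([vocab[w] for w in ["the", "food", "was", "awful"]])
counterfactual = torch.tensor([vocab[w] for w in ["the", "food", "was", "delicious"]])

# gradient of the prediction w.r.t. the original token embeddings
inputs = emb(original).detach().requires_grad_(True)
clf(inputs.mean(dim=0)).sum().backward()

# importance of a change ~ gradient . (counterfactual embedding - original embedding)
delta = emb(counterfactual).detach() - inputs.detach()
importance = (inputs.grad * delta).sum(dim=1)

for pos, (o, c) in enumerate(zip(original, counterfactual)):
    if o != c:
        print(f"position {pos}: change scored {importance[pos].item():+.3f}")
```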