CafGa: Customizing Feature Attributions to Explain Language Models

Alan David Boyle, Furui Cheng, Vilém Zouhar, Mennatallah El-Assady


Abstract
Feature attribution methods, such as SHAP and LIME, explain machine learning model predictions by quantifying the influence of each input component. When applying feature attributions to explain language models, a basic question is how to define the interpretable components. Traditional feature attribution methods commonly treat individual words as atomic units. This is computationally inefficient for long-form text and fails to capture semantic information that spans multiple words. To address this, we present CafGa, an interactive tool for generating and evaluating feature attribution explanations at customizable granularities. CafGa supports customized segmentation with user interaction and visualizes the deletion and insertion curves for explanation assessment. Through a user study involving participants with varying levels of expertise, we confirm CafGa’s usefulness, particularly among LLM practitioners. Explanations created using CafGa were also perceived as more useful than those generated by two fully automatic baseline methods, PartitionSHAP and MExGen, suggesting the effectiveness of the system.
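To make the idea of attribution at a customizable granularity concrete, the following is a minimal sketch and not CafGa's actual implementation or API. It assumes a toy scoring function in place of a language model, uses simple occlusion (mask one group, measure the score drop) instead of SHAP or LIME, and introduces hypothetical names (`toy_model`, `mask_groups`, `group_occlusion`, `deletion_curve`, `MASK`) purely for illustration. Words are assigned to user-defined groups, each group receives one attribution value, and a deletion curve records the model score as groups are masked from most to least important.

```python
from typing import Callable, List, Sequence

MASK = "[MASK]"  # hypothetical placeholder token, an assumption of this sketch


def toy_model(words: Sequence[str]) -> float:
    """Stand-in for a language model: counts positive cue words."""
    positive = {"great", "love"}
    return float(sum(w in positive for w in words))


def mask_groups(words: Sequence[str], groups: List[List[int]],
                hidden: Sequence[int]) -> List[str]:
    """Replace every word that belongs to a hidden group with MASK."""
    out = list(words)
    for g in hidden:
        for i in groups[g]:
            out[i] = MASK
    return out


def group_occlusion(model: Callable[[Sequence[str]], float],
                    words: Sequence[str],
                    groups: List[List[int]]) -> List[float]:
    """Attribution of a group = score drop when only that group is masked."""
    base = model(words)
    return [base - model(mask_groups(words, groups, [g]))
            for g in range(len(groups))]


def deletion_curve(model: Callable[[Sequence[str]], float],
                   words: Sequence[str],
                   groups: List[List[int]],
                   attributions: Sequence[float]) -> List[float]:
    """Model score after cumulatively masking groups, most important first."""
    order = sorted(range(len(groups)),
                   key=lambda g: attributions[g], reverse=True)
    curve = [model(words)]
    for k in range(1, len(order) + 1):
        curve.append(model(mask_groups(words, groups, order[:k])))
    return curve


words = "I love this phone , the camera is great".split()
# Three user-defined segments: first clause, the comma, second clause.
groups = [[0, 1, 2, 3], [4], [5, 6, 7, 8]]

attributions = group_occlusion(toy_model, words, groups)
curve = deletion_curve(toy_model, words, groups, attributions)
print(attributions)
print(curve)
```

With three groups instead of nine words, the sketch needs only three extra model calls for the attributions, which illustrates why coarser segments are cheaper on long text. An insertion curve would be built the same way in reverse, starting from a fully masked input and revealing groups in order of importance.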
Anthology ID:
2025.emnlp-demos.32
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Ivan Habernal, Peter Schulam, Jörg Tiedemann
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
461–470
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-demos.32/
Cite (ACL):
Alan David Boyle, Furui Cheng, Vilém Zouhar, and Mennatallah El-Assady. 2025. CafGa: Customizing Feature Attributions to Explain Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 461–470, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
CafGa: Customizing Feature Attributions to Explain Language Models (Boyle et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-demos.32.pdf