BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers

Enja Kokalj, Blaž Škrlj, Nada Lavrač, Senja Pollak, Marko Robnik-Šikonja


Abstract
Transformer-based neural networks offer very good classification performance across a wide range of domains, but do not provide explanations of their predictions. While several explanation methods, including SHAP, address the problem of interpreting deep learning models, they are not adapted to operate on state-of-the-art transformer-based neural networks such as BERT. Another shortcoming of these methods is that their visualization of explanations in the form of lists of most relevant words does not take into account the sequential and structurally dependent nature of text. This paper proposes the TransSHAP method that adapts SHAP to transformer models, including BERT-based text classifiers. It advances SHAP visualizations by showing explanations in a sequential manner, assessed by human evaluators as competitive with state-of-the-art solutions.
Anthology ID:
2021.hackashop-1.3
Volume:
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
Month:
April
Year:
2021
Address:
Online
Editors:
Hannu Toivonen, Michele Boggia
Venue:
Hackashop
Publisher:
Association for Computational Linguistics
Pages:
16–21
URL:
https://aclanthology.org/2021.hackashop-1.3
Cite (ACL):
Enja Kokalj, Blaž Škrlj, Nada Lavrač, Senja Pollak, and Marko Robnik-Šikonja. 2021. BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers. In Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pages 16–21, Online. Association for Computational Linguistics.
Cite (Informal):
BERT meets Shapley: Extending SHAP Explanations to Transformer-based Classifiers (Kokalj et al., Hackashop 2021)
PDF:
https://preview.aclanthology.org/landing_page/2021.hackashop-1.3.pdf