InterpreT: An Interactive Visualization Tool for Interpreting Transformers

Vasudev Lal, Arden Ma, Estelle Aflalo, Phillip Howard, Ana Simoes, Daniel Korat, Oren Pereg, Gadi Singer, Moshe Wasserblat


Abstract
With the increasingly widespread use of Transformer-based models for NLU/NLP tasks, there is growing interest in understanding the inner workings of these models, why they are so effective at a wide range of tasks, and how they can be further tuned and improved. To contribute towards this goal of enhanced explainability and comprehension, we present InterpreT, an interactive visualization tool for interpreting Transformer-based models. In addition to providing various mechanisms for investigating general model behaviours, novel contributions made in InterpreT include the ability to track and visualize token embeddings through each layer of a Transformer, highlight distances between certain token embeddings through illustrative plots, and identify task-related functions of attention heads by using new metrics. InterpreT is a task-agnostic tool, and its functionalities are demonstrated through the analysis of model behaviours for two disparate tasks: Aspect Based Sentiment Analysis (ABSA) and the Winograd Schema Challenge (WSC).
Anthology ID: 2021.eacl-demos.17
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
Month: April
Year: 2021
Address: Online
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 135–142
URL: https://aclanthology.org/2021.eacl-demos.17
DOI: 10.18653/v1/2021.eacl-demos.17
Cite (ACL): Vasudev Lal, Arden Ma, Estelle Aflalo, Phillip Howard, Ana Simoes, Daniel Korat, Oren Pereg, Gadi Singer, and Moshe Wasserblat. 2021. InterpreT: An Interactive Visualization Tool for Interpreting Transformers. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 135–142, Online. Association for Computational Linguistics.
Cite (Informal): InterpreT: An Interactive Visualization Tool for Interpreting Transformers (Lal et al., EACL 2021)
PDF: https://preview.aclanthology.org/update-css-js/2021.eacl-demos.17.pdf
Data: SuperGLUE