@inproceedings{feng-etal-2024-promptexplainer,
title = "{P}rompt{E}xplainer: Explaining Language Models through Prompt-based Learning",
author = "Feng, Zijian and
Zhou, Hanzhang and
Zhu, Zixiao and
Mao, Kezhi",
editor = "Graham, Yvette and
Purver, Matthew",
booktitle = "Findings of the Association for Computational Linguistics: EACL 2024",
month = mar,
year = "2024",
address = "St. Julian{'}s, Malta",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/add-emnlp-2024-awards/2024.findings-eacl.60/",
pages = "882--895",
abstract = "Pretrained language models have become workhorses for various natural language processing (NLP) tasks, sparking a growing demand for enhanced interpretability and transparency. However, prevailing explanation methods, such as attention-based and gradient-based strategies, largely rely on linear approximations, potentially causing inaccuracies such as accentuating irrelevant input tokens. To mitigate the issue, we develop PromptExplainer, a novel method for explaining language models through prompt-based learning. PromptExplainer aligns the explanation process with the masked language modeling (MLM) task of pretrained language models and leverages the prompt-based learning framework for explanation generation. It disentangles token representations into the explainable embedding space using the MLM head and extracts discriminative features with a verbalizer to generate class-dependent explanations. Extensive experiments demonstrate that PromptExplainer significantly outperforms state-of-the-art explanation methods."
}
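The abstract describes projecting token representations into vocabulary space via the MLM head and scoring them against verbalizer label words to obtain class-dependent explanations. Below is a minimal sketch of that general idea, not the authors' implementation: the choice of `bert-base-uncased`, the example sentence, and the toy two-class verbalizer are all assumptions for illustration.

```python
# Sketch (not the paper's code): score each input token's class relevance by
# projecting its hidden state through the pretrained MLM head and aggregating
# the resulting vocabulary logits over each class's verbalizer label words.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed MLM
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical verbalizer: label words standing in for each class.
verbalizer = {"positive": ["great", "good"], "negative": ["terrible", "awful"]}
label_ids = {c: tokenizer.convert_tokens_to_ids(ws) for c, ws in verbalizer.items()}

text = "the film was a delight from start to finish"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # logits: (seq_len, vocab_size) -- every token position, not just [MASK],
    # is mapped into the vocabulary ("explainable") space by the MLM head.
    logits = model(**inputs).logits[0]

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for cls, ids in label_ids.items():
    # Class-dependent relevance: sum each position's logits over the
    # verbalizer tokens for this class, then rank input tokens by score.
    scores = logits[:, ids].sum(dim=-1)
    ranked = sorted(zip(tokens, scores.tolist()), key=lambda p: -p[1])
    print(cls, ranked[:5])
```

As a usage note, ranking tokens by these class-conditioned logit sums gives one plausible reading of "class-dependent explanations"; the paper itself should be consulted for the exact scoring and normalization it uses.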