Abstract
Pretrained language models have become workhorses for various natural language processing (NLP) tasks, sparking a growing demand for enhanced interpretability and transparency. However, prevailing explanation methods, such as attention-based and gradient-based strategies, largely rely on linear approximations, potentially causing inaccuracies such as accentuating irrelevant input tokens. To mitigate the issue, we develop PromptExplainer, a novel method for explaining language models through prompt-based learning. PromptExplainer aligns the explanation process with the masked language modeling (MLM) task of pretrained language models and leverages the prompt-based learning framework for explanation generation. It disentangles token representations into the explainable embedding space using the MLM head and extracts discriminative features with a verbalizer to generate class-dependent explanations. Extensive experiments demonstrate that PromptExplainer significantly outperforms state-of-the-art explanation methods.
- Anthology ID:
- 2024.findings-eacl.60
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2024
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 882–895
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2024.findings-eacl.60/
- Cite (ACL):
- Zijian Feng, Hanzhang Zhou, Zixiao Zhu, and Kezhi Mao. 2024. PromptExplainer: Explaining Language Models through Prompt-based Learning. In Findings of the Association for Computational Linguistics: EACL 2024, pages 882–895, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- PromptExplainer: Explaining Language Models through Prompt-based Learning (Feng et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2024.findings-eacl.60.pdf