JoPA: Explaining Large Language Model’s Generation via Joint Prompt Attribution

Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, Lu Lin


Abstract
Large Language Models (LLMs) have demonstrated impressive performance on complex text generation tasks. However, the contribution of the input prompt to the generated content remains obscure to humans, underscoring the need to understand the causality between input and output pairs. Existing works providing prompt-specific explanations often confine the model output to classification or next-word prediction. The few initial attempts at explaining an entire generation tend to treat input prompt texts independently, ignoring their combinatorial effects on the follow-up generation. In this study, we introduce a counterfactual explanation framework based on joint prompt attribution, JoPA, which aims to explain how a few prompt texts collaboratively influence the LLM's complete generation. In particular, we formulate prompt attribution for generation interpretation as a combinatorial optimization problem, and introduce a probabilistic algorithm to search for the causal input combination in the discrete space. We define and use multiple metrics to evaluate the produced explanations, demonstrating both the faithfulness and efficiency of our framework.
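The abstract frames attribution as a probabilistic search over a discrete space of prompt-token subsets. The sketch below is only an illustration of that general idea, not the paper's actual JoPA algorithm: a synthetic scorer (`build_toy_scorer`, a hypothetical stand-in for an LLM's generation likelihood) encodes a joint effect between two tokens, and a stochastic local search over removal masks looks for the token combination whose joint removal most degrades the score. All names and parameters here are assumptions made for the example.

```python
import random


def build_toy_scorer(weights, redundant_pairs):
    """Toy stand-in for an LLM's generation log-likelihood given a keep-mask.

    weights[i] is token i's marginal contribution; redundant_pairs maps
    (i, j) -> bonus that survives as long as EITHER token is kept, so only
    removing both tokens jointly loses it. This mimics combinatorial effects
    that per-token (independent) attribution would miss.
    """
    def score(mask):
        s = sum(w for i, w in enumerate(weights) if mask[i])
        s += sum(b for (i, j), b in redundant_pairs.items()
                 if mask[i] or mask[j])
        return s
    return score


def search_causal_subset(score, n, k, steps=2000, seed=0):
    """Stochastic local search over binary masks: find k tokens whose joint
    removal causes the largest drop in the generation score."""
    rng = random.Random(seed)
    full = score([1] * n)

    def drop(removed):
        mask = [0 if i in removed else 1 for i in range(n)]
        return full - score(mask)

    removed = set(rng.sample(range(n), k))
    best, best_drop = set(removed), drop(removed)
    for _ in range(steps):
        # Propose swapping one removed token for one kept token.
        out_tok = rng.choice(sorted(removed))
        in_tok = rng.choice([i for i in range(n) if i not in removed])
        cand = (removed - {out_tok}) | {in_tok}
        # Accept improvements (and plateau moves); occasionally accept a
        # worse move to escape local optima.
        if drop(cand) >= drop(removed) or rng.random() < 0.05:
            removed = cand
        if drop(removed) > best_drop:
            best, best_drop = set(removed), drop(removed)
    return sorted(best), best_drop


# Tokens 2 and 5 are individually unimportant but jointly critical.
score = build_toy_scorer([0.1] * 8, {(2, 5): 3.0})
subset, d = search_causal_subset(score, n=8, k=2)
```

Removing token 2 or 5 alone drops the score by only 0.1, so independent attribution ranks them as unimportant; the joint search instead recovers the pair {2, 5}, whose combined removal costs 3.2.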
Anthology ID:
2025.acl-long.1074
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
22106–22122
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1074/
Cite (ACL):
Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, and Lu Lin. 2025. JoPA: Explaining Large Language Model’s Generation via Joint Prompt Attribution. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 22106–22122, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
JoPA: Explaining Large Language Model’s Generation via Joint Prompt Attribution (Chang et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1074.pdf