DisComp: A Two-Stage Prompt Optimization Framework Combining Task-Agnostic and Task-Aware Compression

Liu Quancai, Haihui Fan, Jinchao Zhang, Lixiangfang Lixiangfang, Lichuanrong Lichuanrong, Bo Li


Abstract
Large language models (LLMs) exhibit exceptional performance across a wide range of natural language processing tasks, often relying on lengthy prompts to harness their full capabilities. However, extended prompts incur substantial computational overhead and increased hardware demands, limiting the scalability and efficiency of such models. In this paper, we propose DisComp, a two-stage prompt compression framework based on knowledge distillation that combines task-agnostic and task-aware strategies, designed to compress prompts efficiently without compromising performance. In the first stage, task-agnostic compression is achieved through knowledge distillation, transferring the summarization capabilities of an LLM to a smaller, more efficient model. The distillation objective combines a cross-entropy loss with a keyword-matching loss to ensure the smaller model generates concise yet informative summaries. In the second stage, sentence-level pruning is applied: sentences are ranked by their relevance to the query, and irrelevant sentences are pruned so that only task-critical information is retained. We evaluate our method on three benchmark datasets, LongBench, ZeroSCROLLS, and NaturalQuestions. The results show that DisComp significantly outperforms previous task-agnostic and task-specific compression approaches, and it achieves up to 6.56× faster inference than the best token-level compression method.
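To make the second stage concrete, below is a minimal sketch of query-aware sentence-level pruning as the abstract describes it: sentences are scored for relevance to the query and the lowest-ranked ones are dropped. The TF-IDF scorer, the `prune_sentences` helper, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual ranking model or thresholds.

```python
# Hypothetical sketch of stage-2 sentence-level pruning (not the authors' code).
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def prune_sentences(prompt: str, query: str, keep_ratio: float = 0.5) -> str:
    """Rank prompt sentences by relevance to the query and keep the top fraction."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", prompt) if s.strip()]
    if not sentences:
        return prompt
    # Embed the query and sentences with TF-IDF (a stand-in for the paper's ranker).
    vectorizer = TfidfVectorizer().fit(sentences + [query])
    sent_vecs = vectorizer.transform(sentences)
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, sent_vecs).ravel()
    # Keep the highest-scoring sentences, preserving their original order.
    k = max(1, int(len(sentences) * keep_ratio))
    keep_idx = sorted(np.argsort(scores)[::-1][:k])
    return " ".join(sentences[i] for i in keep_idx)

# Usage: the stage-1 distilled summarizer would first shorten the prompt
# task-agnostically; this step then drops sentences irrelevant to the query.
compressed = prune_sentences(
    prompt="Paris is the capital of France. The Eiffel Tower opened in 1889. "
           "Bananas are rich in potassium.",
    query="When did the Eiffel Tower open?",
    keep_ratio=0.34,
)
print(compressed)  # -> "The Eiffel Tower opened in 1889."
```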
Anthology ID:
2025.findings-naacl.58
Volume:
Findings of the Association for Computational Linguistics: NAACL 2025
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1033–1044
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.58/
Cite (ACL):
Liu Quancai, Haihui Fan, Jinchao Zhang, Lixiangfang Lixiangfang, Lichuanrong Lichuanrong, and Bo Li. 2025. DisComp: A Two-Stage Prompt Optimization Framework Combining Task-Agnostic and Task-Aware Compression. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 1033–1044, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
DisComp: A Two-Stage Prompt Optimization Framework Combining Task-Agnostic and Task-Aware Compression (Quancai et al., Findings 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.58.pdf