PreSumm: Predicting Summarization Performance Without Summarizing

Steven Koniaev, Ori Ernst, Jackie CK Cheung


Abstract
Despite recent advancements in automatic summarization, state-of-the-art models do not summarize all documents equally well, raising the question: why? While prior research has extensively analyzed summarization models, little attention has been given to the role of document characteristics in influencing summarization performance. In this work, we explore two key research questions. First, do documents exhibit consistent summarization quality across multiple systems? If so, can we predict a document’s summarization performance without generating a summary? We answer both questions affirmatively and introduce PreSumm, a novel task in which a system predicts summarization performance based solely on the source document. Our analysis sheds light on common properties of documents with low PreSumm scores, revealing that they often suffer from coherence issues, complex content, or a lack of a clear main theme. In addition, we demonstrate PreSumm’s practical utility in two key applications: improving hybrid summarization workflows by identifying documents that require manual summarization and enhancing dataset quality by filtering outliers and noisy documents. Overall, our findings highlight the critical role of document properties in summarization performance and offer insights into the limitations of current systems that could serve as the basis for future improvements.
Anthology ID:
2025.findings-acl.940
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
18289–18305
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.findings-acl.940/
DOI:
10.18653/v1/2025.findings-acl.940
Cite (ACL):
Steven Koniaev, Ori Ernst, and Jackie CK Cheung. 2025. PreSumm: Predicting Summarization Performance Without Summarizing. In Findings of the Association for Computational Linguistics: ACL 2025, pages 18289–18305, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
PreSumm: Predicting Summarization Performance Without Summarizing (Koniaev et al., Findings 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.findings-acl.940.pdf