Self-Sum: Teaching an Agent to Decide Itself When and What to Summarize

Hongru Wang, Rui Wang, Jushi Kai, Boyang Xue, Yongqi Li, Shijue Huang, Xiaoteng Ma, Jeff Z. Pan, Amos Storkey


Abstract
Long-horizon agents operate over extended sequences of reasoning and actions, but this inevitably accumulates context noise, resulting in excessive computational cost and information overload. Existing approaches commonly rely on fixed, rule-based summarization strategies (e.g., summarizing every few steps), which are inflexible, lack generalization, and often introduce irreversible information loss. We propose Self-Sum, a framework that empowers agents to autonomously decide when and what to summarize by modeling summarization as a first-class internal cognitive action, unified with external environmental actions within a multi-turn decision-making process. Specifically, we introduce a two-stage training recipe consisting of (i) a cold-start supervised fine-tuning stage that bootstraps summarization behavior, and (ii) a lightweight, summarization-aware reinforcement learning stage that refines summarization timing and content while discouraging unnecessary summaries. Experiments on multiple long-horizon benchmarks show that Self-Sum consistently outperforms no-summarization and rule-based baselines, with particularly strong gains in generalization. Analysis further reveals that Self-Sum learns to summarize sparsely at meaningful moments and preserves task-relevant information, highlighting the importance of jointly learning when and what to summarize for robust long-horizon agent behavior.
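The idea of treating summarization as one more action in the agent's multi-turn loop can be illustrated with a minimal sketch. This is not the authors' implementation: the names `EnvAction`, `Summarize`, `Context`, `step`, `run_episode`, and `scripted_policy` are hypothetical, and the scripted policy stands in for the trained model, which in the paper's setting would itself choose when to summarize and what to write.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union


@dataclass
class EnvAction:
    command: str  # external action sent to the environment


@dataclass
class Summarize:
    summary: str  # internal action: text that replaces the accumulated context


Action = Union[EnvAction, Summarize]


@dataclass
class Context:
    task: str
    history: List[str] = field(default_factory=list)


def step(ctx: Context, action: Action, env: Callable[[str], str]) -> None:
    """Apply one action; both action types pass through the same loop."""
    if isinstance(action, Summarize):
        # Internal action: no environment call, history collapses to the summary.
        ctx.history = [f"[summary] {action.summary}"]
    else:
        observation = env(action.command)
        ctx.history.append(f"action: {action.command}")
        ctx.history.append(f"observation: {observation}")


def run_episode(
    task: str,
    policy: Callable[[Context], Optional[Action]],
    env: Callable[[str], str],
    max_turns: int = 10,
) -> Context:
    ctx = Context(task)
    for _ in range(max_turns):
        action = policy(ctx)
        if action is None:  # policy signals that the episode is over
            break
        step(ctx, action, env)
    return ctx


def scripted_policy(actions: List[Action]) -> Callable[[Context], Optional[Action]]:
    """Assumption: a fixed script replaces the learned policy for illustration."""
    it = iter(actions)
    return lambda ctx: next(it, None)


def toy_env(command: str) -> str:
    return f"result of {command}"


if __name__ == "__main__":
    script = [
        EnvAction("search A"),
        EnvAction("open A"),
        Summarize("A is relevant"),
        EnvAction("search B"),
    ]
    final = run_episode("toy task", scripted_policy(script), toy_env)
    for line in final.history:
        print(line)
```

In this sketch the summary replaces the earlier turns, so the context handed to the policy after a `Summarize` action holds only the summary plus whatever follows it.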
Anthology ID:
2026.findings-acl.447
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9178–9192
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.447/
Cite (ACL):
Hongru Wang, Rui Wang, Jushi Kai, Boyang Xue, Yongqi Li, Shijue Huang, Xiaoteng Ma, Jeff Z. Pan, and Amos Storkey. 2026. Self-Sum: Teaching an Agent to Decide Itself When and What to Summarize. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9178–9192, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Self-Sum: Teaching an Agent to Decide Itself When and What to Summarize (Wang et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.447.pdf
Checklist:
 2026.findings-acl.447.checklist.pdf