CondenseFlow: Scalable Latent Space Collaboration via Semantic Compression for Multi-Agent Systems

Xiaoyu Chen; Fengge Wu; Zhao Junsuo; Yun Fan

Abstract

Full-state latent communication in LLM-based multi-agent systems offers richer semantics than text but suffers from memory overhead scaling linearly with collaboration rounds. We propose CondenseFlow, which introduces the Latent Thought Condenser (LTC)—a lightweight module using learnable semantic probes to compress KV caches into fixed-size representations, achieving 𝒪(1) communication complexity regardless of context length. We theoretically prove that compression error is bounded by attention concentration and accumulates controllably across rounds. On seven benchmarks spanning six models, CondenseFlow reduces KV cache memory by over 99% and inference latency by approximately 20% compared to dense transfer with negligible accuracy degradation, while outperforming text-based methods by 1.7 percentage points on average across all configurations. Code is available at https://github.com/xxy33/condenseflow.
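The fixed-size property claimed in the abstract can be illustrated with a small attention-pooling sketch. This is not the authors' LTC implementation: it assumes the probes act as queries that attend over a sequence of cached state vectors, and the names `condense`, `probes`, `D`, and `M` are hypothetical. The probes here are random rather than learned, and a real KV cache would hold separate key and value tensors per layer and head. The point is only that the output has one vector per probe, independent of the input length.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def condense(states, probes):
    """Pool n state vectors into len(probes) summary vectors.

    Each probe scores every state with a scaled dot product, and the
    summary is the softmax-weighted average of the states.
    """
    dim = len(probes[0])
    scale = math.sqrt(dim)
    summaries = []
    for probe in probes:
        scores = [sum(p * s for p, s in zip(probe, state)) / scale
                  for state in states]
        weights = softmax(scores)
        summaries.append([sum(w * state[j] for w, state in zip(weights, states))
                          for j in range(dim)])
    return summaries


random.seed(0)
D, M = 16, 4  # hypothetical hidden size and probe count


def random_vectors(count):
    """Stand-in for cached states or learned probes (assumption: Gaussian)."""
    return [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(count)]


probes = random_vectors(M)
short_summary = condense(random_vectors(32), probes)
long_summary = condense(random_vectors(512), probes)

# Both summaries contain M vectors of size D, whatever the context length.
print(len(short_summary), len(long_summary))
```

Under these assumptions, the payload passed between agents stays at M × D values per round, which is the sense in which communication cost does not grow with context length.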

Anthology ID: 2026.findings-acl.669
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13694–13712
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.669/
Cite (ACL): Xiaoyu Chen, Fengge Wu, Zhao Junsuo, and Yun Fan. 2026. CondenseFlow: Scalable Latent Space Collaboration via Semantic Compression for Multi-Agent Systems. In Findings of the Association for Computational Linguistics: ACL 2026, pages 13694–13712, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): CondenseFlow: Scalable Latent Space Collaboration via Semantic Compression for Multi-Agent Systems (Chen et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.669.pdf
Checklist: 2026.findings-acl.669.checklist.pdf
