HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

Dong Liu, Yanxuan Yu


Abstract
Semantic parsing of long documents remains challenging due to quadratic growth in pairwise composition and memory requirements. We introduce Hierarchical Segment-Graph Memory (HSGM), a novel framework that decomposes an input of length N into M meaningful segments, constructs Local Semantic Graphs on each segment, and extracts compact summary nodes to form a Global Graph Memory. HSGM supports incremental updates, in which only newly arrived segments incur local graph construction and summary-node integration, while Hierarchical Query Processing locates relevant segments via top-K retrieval over summary nodes and then performs fine-grained reasoning within their local graphs. Theoretically, HSGM reduces worst-case complexity from O(N^2) to O(Nk + (N/k)^2), with segment size k ≪ N, and we derive Frobenius-norm bounds on the approximation error introduced by node summarization and sparsification thresholds. Empirically, on three benchmarks (long-document AMR parsing, segment-level semantic role labeling on OntoNotes, and legal event extraction), HSGM achieves 2–4× inference speedup, >60% reduction in peak memory, and ≥95% of baseline accuracy. Our approach unlocks scalable, accurate semantic modeling for ultra-long texts, enabling real-time and resource-constrained NLP applications.
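To make the complexity claim concrete, the following minimal Python sketch compares the two cost expressions from the abstract. It is an illustration only, not the authors' implementation. It assumes equal-sized segments so that M = N/k, drops all constant factors, and uses hypothetical function names (`dense_cost`, `hsgm_cost`, `balanced_segment_size`). The balancing segment size k = (2N)^(1/3) is not stated in the abstract; it comes from setting the derivative of Nk + (N/k)^2 with respect to k to zero.

```python
def dense_cost(n: int) -> int:
    """Pairwise composition over all n tokens: the O(N^2) baseline term."""
    return n * n


def hsgm_cost(n: int, k: int) -> float:
    """Cost model N*k + (N/k)^2 from the abstract, constants dropped.

    Assumption: segments have equal size k, so there are M = n / k of them.
    The first term models local graph construction within each segment,
    the second models pairwise work over the M summary nodes.
    """
    m = n / k
    return n * k + m * m


def balanced_segment_size(n: int) -> float:
    """Segment size minimizing n*k + (n/k)^2 (derived here, not from the paper).

    d/dk [n*k + n^2 / k^2] = n - 2*n^2 / k^3 = 0  gives  k = (2n)^(1/3).
    """
    return (2 * n) ** (1 / 3)


if __name__ == "__main__":
    n = 4000
    k = round(balanced_segment_size(n))
    print(f"N={n}, k={k}")
    print(f"dense cost: {dense_cost(n)}")
    print(f"HSGM cost model: {hsgm_cost(n, k):.0f}")
```

Under this simplified model the total cost at the balancing point grows as N^(4/3), which is where the asymptotic saving over N^2 comes from; the speedups measured in the paper depend on constants and implementation details that the sketch ignores.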
Anthology ID:
2025.starsem-1.26
Volume:
Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Lea Frermann, Mark Stevenson
Venue:
*SEM
Publisher:
Association for Computational Linguistics
Pages:
328–337
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.26/
Cite (ACL):
Dong Liu and Yanxuan Yu. 2025. HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics. In Proceedings of the 14th Joint Conference on Lexical and Computational Semantics (*SEM 2025), pages 328–337, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics (Liu & Yu, *SEM 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.starsem-1.26.pdf