Structure-Aware Chunking for Abstractive Summarization of Long Legal Documents

Himadri Sonowal, Saisab Sadhu


Abstract
The efficacy of state-of-the-art abstractive summarization models is severely constrained by the extreme document lengths of legal judgments, which consistently surpass their fixed input capacities. The prevailing method, naive sequential chunking, is a discourse-agnostic process that induces context fragmentation and degrades summary coherence. This paper introduces Structure-Aware Chunking (SAC), a rhetorically informed pre-processing pipeline that leverages the intrinsic logical structure of legal documents. We partition judgments into their constituent rhetorical strata (Facts, Arguments & Analysis, and Conclusion) prior to the summarization pass. We present and evaluate two SAC instantiations: a computationally efficient heuristic-based segmenter and a semantically robust LLM-driven approach. Empirical validation on the JUST-NLP 2025 L-SUMM shared task dataset reveals a nuanced trade-off: while our methods improve local, n-gram-based metrics (ROUGE-2), they struggle to maintain global coherence (ROUGE-L). We identify this “coherence gap” as a critical challenge in chunk-based summarization and show that advanced LLM-based segmentation begins to bridge it.
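To make the idea of a heuristic-based segmenter concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the actual heuristics, so the cue phrases, the function names `segment_judgment` and `chunk_section`, and the `max_words` budget are all illustrative assumptions. The sketch assigns each paragraph to Facts, Arguments & Analysis, or Conclusion by advancing through the three sections in order, then splits each section into chunks that respect a word budget without crossing section boundaries.

```python
import re
from typing import Dict, List

# Hypothetical cue phrases (assumptions for illustration only; the paper's
# real heuristics are not described in the abstract).
ANALYSIS_CUES = re.compile(
    r"\b(learned counsel|it was contended|it is submitted|we have heard)\b",
    re.IGNORECASE,
)
CONCLUSION_CUES = re.compile(
    r"\b(in the result|for the foregoing reasons|"
    r"appeal is (allowed|dismissed)|petition is (allowed|dismissed))\b",
    re.IGNORECASE,
)

SECTIONS = ("facts", "arguments_analysis", "conclusion")


def segment_judgment(text: str) -> Dict[str, List[str]]:
    """Assign each paragraph to one rhetorical section.

    The section index only moves forward, reflecting the assumption that a
    judgment proceeds from facts to analysis to conclusion.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sections: Dict[str, List[str]] = {name: [] for name in SECTIONS}
    stage = 0
    for para in paragraphs:
        if stage < 2 and CONCLUSION_CUES.search(para):
            stage = 2
        elif stage < 1 and ANALYSIS_CUES.search(para):
            stage = 1
        sections[SECTIONS[stage]].append(para)
    return sections


def chunk_section(paragraphs: List[str], max_words: int = 512) -> List[str]:
    """Greedily pack whole paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for para in paragraphs:
        n = len(para.split())
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each chunk produced this way belongs to exactly one rhetorical section, which is the property that distinguishes this kind of chunking from naive sequential splitting. A real segmenter would need far richer cues and would have to handle judgments that do not follow the assumed ordering.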
Anthology ID:
2025.justnlp-main.19
Volume:
Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025)
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Ashutosh Modi, Saptarshi Ghosh, Asif Ekbal, Pawan Goyal, Sarika Jain, Abhinav Joshi, Shivani Mishra, Debtanu Datta, Shounak Paul, Kshetrimayum Boynao Singh, Sandeep Kumar
Venues:
JUSTNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
171–178
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.justnlp-main.19/
Cite (ACL):
Himadri Sonowal and Saisab Sadhu. 2025. Structure-Aware Chunking for Abstractive Summarization of Long Legal Documents. In Proceedings of the 1st Workshop on NLP for Empowering Justice (JUST-NLP 2025), pages 171–178, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
Structure-Aware Chunking for Abstractive Summarization of Long Legal Documents (Sonowal & Sadhu, JUSTNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.justnlp-main.19.pdf