Benchmarking LLMs on Semantic Overlap Summarization
John Salvador, Naman Bansal, Mousumi Akter, Souvika Sarkar, Anupam Das, Santu Karmaker
Abstract
Semantic Overlap Summarization (SOS) is a multi-document summarization task focused on extracting the common information shared across alternative narratives, a capability that is critical for trustworthy generation in domains such as news, law, and healthcare. We benchmark popular Large Language Models (LLMs) on SOS and introduce PrivacyPolicyPairs (3P), a new dataset of 135 high-quality samples from privacy policy documents, which complements existing resources and broadens domain coverage. Using the TELeR prompting taxonomy, we evaluate nearly one million LLM-generated summaries across two SOS datasets and conduct human evaluation on a curated subset. Our analysis reveals strong prompt sensitivity, identifies which automatic metrics align most closely with human judgments, and provides new baselines for future SOS research.
- Anthology ID:
- 2025.emnlp-main.1692
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33340–33361
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1692/
- Cite (ACL):
- John Salvador, Naman Bansal, Mousumi Akter, Souvika Sarkar, Anupam Das, and Santu Karmaker. 2025. Benchmarking LLMs on Semantic Overlap Summarization. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 33340–33361, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Benchmarking LLMs on Semantic Overlap Summarization (Salvador et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1692.pdf