From Keyterms to Context: Exploring Topic Description Generation in Scientific Corpora
Pierre Achkar, Satiyabooshan Murugaboopathy, Anne Kreuter, Tim Gollub, Martin Potthast, Yuri Campbell
Abstract
Topic models represent topics as ranked term lists, which are often hard to interpret in scientific domains. We explore Topic Description for Scientific Corpora, an approach to generating structured summaries for topic-specific document sets. We propose and investigate two LLM-based pipelines: Selective Context Summarisation (SCS), which uses maximum marginal relevance to select representative documents; and Compressed Context Summarisation (CCS), a hierarchical approach that compresses document sets through iterative summarisation. We evaluate both methods using SUPERT and multi-model LLM-as-a-Judge across three topic modeling backbones and three scientific corpora. Our preliminary results suggest that SCS tends to outperform CCS in quality and robustness, while CCS shows potential advantages on larger topics. Our findings highlight interesting trade-offs between selective and compressed strategies for topic-level summarisation in scientific domains. We release code and data for two of the three datasets.
- Anthology ID:
- 2025.newsum-main.8
- Volume:
- Proceedings of The 5th New Frontiers in Summarization Workshop
- Month:
- November
- Year:
- 2025
- Address:
- Hybrid
- Editors:
- Yue Dong, Wen Xiao, Haopeng Zhang, Rui Zhang, Ori Ernst, Lu Wang, Fei Liu
- Venues:
- NewSum | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 102–122
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.newsum-main.8/
- Cite (ACL):
- Pierre Achkar, Satiyabooshan Murugaboopathy, Anne Kreuter, Tim Gollub, Martin Potthast, and Yuri Campbell. 2025. From Keyterms to Context: Exploring Topic Description Generation in Scientific Corpora. In Proceedings of The 5th New Frontiers in Summarization Workshop, pages 102–122, Hybrid. Association for Computational Linguistics.
- Cite (Informal):
- From Keyterms to Context: Exploring Topic Description Generation in Scientific Corpora (Achkar et al., NewSum 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.newsum-main.8.pdf
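
The abstract says that SCS selects representative documents with maximum marginal relevance (MMR). As a rough illustration of that general technique, and not of the authors' released implementation, the sketch below greedily picks documents that are similar to a topic vector while penalising similarity to documents already chosen. The names `cosine` and `mmr_select`, the trade-off parameter `lam`, and the use of plain embedding vectors with cosine similarity are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(doc_vecs, topic_vec, k, lam=0.5):
    """Greedy MMR: return indices of up to k documents.

    doc_vecs  : list of document embedding vectors (assumed input format)
    topic_vec : vector representing the topic (assumed, e.g. a centroid)
    lam       : 1.0 ranks by relevance only; lower values favour diversity
    """
    relevance = [cosine(d, topic_vec) for d in doc_vecs]
    selected = []
    candidates = list(range(len(doc_vecs)))

    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Toy data: documents 0 and 1 are duplicates, document 2 covers a different direction.
docs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
topic = [1.0, 0.5]
picked = mmr_select(docs, topic, k=2, lam=0.5)
```

With `lam=0.5` the duplicate document is skipped in favour of the less relevant but novel one, whereas `lam=1.0` reduces the procedure to plain relevance ranking. How the paper embeds documents, represents the topic, or sets the trade-off is not stated on this page.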