Abstract
Query-based text summarization aims to extract the essential information that answers a query from the original text, presenting the answer in a minimal, often predefined, number of words. In this paper we introduce a new unsupervised approach for query-based extractive summarization, based on the minimum description length (MDL) principle, that employs the Krimp compression algorithm (Vreeken et al., 2011). The key idea of our approach is to select frequent word sets related to a given query that compress document sentences better and therefore describe the document better. A summary is extracted by selecting sentences that best cover query-related frequent word sets. The approach is evaluated on the DUC 2005 and DUC 2006 datasets, which are specifically designed for query-based summarization (DUC, 2005, 2006). It is competitive with the best results.

- Anthology ID: W17-1004
- Volume: Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
- Month: April
- Year: 2017
- Address: Valencia, Spain
- Venue: MultiLing
- SIG:
- Publisher: Association for Computational Linguistics
- Note:
- Pages: 22–31
- Language:
- URL: https://aclanthology.org/W17-1004
- DOI: 10.18653/v1/W17-1004
- Cite (ACL): Marina Litvak and Natalia Vanetik. 2017. Query-based summarization using MDL principle. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, pages 22–31, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal): Query-based summarization using MDL principle (Litvak & Vanetik, MultiLing 2017)
- PDF: https://preview.aclanthology.org/starsem-semeval-split/W17-1004.pdf