Towards Holistic Summarization – Selecting Summaries, Not Sentences

Martin Hassel, Jonas Sjöbergh


Abstract
In this paper we present a novel method for automatic text summarization through text extraction, using computational semantics. The new idea is to view all the extracted text as a whole and compute a score for the total impact of the summary, instead of ranking for instance individual sentences. A greedy search strategy is used to search through the space of possible summaries and select the summary with the highest score of those found. The aim has been to construct a summarizer that can be quickly assembled, with the use of only a very few basic language tools, for languages that lack large amounts of structured or annotated data or advanced tools for linguistic processing. The proposed method is largely language independent, though we only evaluate it on English in this paper, using ROUGE-scores on texts from among others the DUC 2004 task 2. On this task our method performs better than several of the systems evaluated there, but worse than the best systems.
Anthology ID:
L06-1117
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/218_pdf.pdf
DOI:
Bibkey:
Cite (ACL):
Martin Hassel and Jonas Sjöbergh. 2006. Towards Holistic Summarization – Selecting Summaries, Not Sentences. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Towards Holistic Summarization – Selecting Summaries, Not Sentences (Hassel & Sjöbergh, LREC 2006)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/218_pdf.pdf