Sentence Centrality Revisited for Unsupervised Summarization

Hao Zheng, Mirella Lapata


Abstract
Single document summarization has enjoyed renewed interest in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. In this paper we develop an unsupervised approach, arguing that it is unrealistic to expect large-scale and high-quality training data to be available or created for different types of summaries, domains, or languages. We revisit a popular graph-based ranking algorithm and modify how node (i.e., sentence) centrality is computed in two ways: (a) we employ BERT, a state-of-the-art neural representation learning model, to better capture sentential meaning, and (b) we build graphs with directed edges, arguing that the contribution of any two nodes to their respective centrality is influenced by their relative position in a document. Experimental results on three news summarization datasets representative of different languages and writing styles show that our approach outperforms strong baselines by a wide margin.
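The centrality modification described in the abstract is simple enough to sketch. Below is a minimal NumPy illustration of directed, position-weighted centrality, assuming sentence embeddings (e.g., mean-pooled BERT states) are computed elsewhere; the function names, the similarity threshold beta, and the default lambda weights are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def directed_centrality(emb, lambda_back=-0.5, lambda_fwd=1.0, beta=0.6):
    """Position-weighted centrality over a directed sentence graph.

    emb: (n, d) array of sentence embeddings in document order.
    Edges to earlier sentences and edges to later sentences are
    weighted differently, so a sentence's position in the document
    influences its score.
    """
    sim = emb @ emb.T                        # pairwise dot-product similarity
    lo, hi = sim.min(), sim.max()
    sim = sim - (lo + beta * (hi - lo))      # shift so weak edges turn negative
    n = sim.shape[0]
    scores = np.empty(n)
    for i in range(n):
        backward = sim[i, :i].sum()          # edges from earlier sentences
        forward = sim[i, i + 1:].sum()       # edges to later sentences
        scores[i] = lambda_back * backward + lambda_fwd * forward
    return scores

def extract_summary(sentences, emb, k=3):
    """Return the k most central sentences, restored to document order."""
    top = np.argsort(directed_centrality(emb))[-k:]
    return [sentences[i] for i in sorted(top)]
```

Setting lambda_back and lambda_fwd to the same value recovers standard undirected centrality; letting them differ (lambda_back may even be negative) is what allows early, topic-setting sentences to accumulate higher scores. The authors' implementation is in the mswellhao/PacSum repository listed below.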
Anthology ID: P19-1628
Volume: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2019
Address: Florence, Italy
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6236–6247
URL: https://aclanthology.org/P19-1628
DOI: 10.18653/v1/P19-1628
Cite (ACL): Hao Zheng and Mirella Lapata. 2019. Sentence Centrality Revisited for Unsupervised Summarization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6236–6247, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Sentence Centrality Revisited for Unsupervised Summarization (Zheng & Lapata, ACL 2019)
PDF: https://aclanthology.org/P19-1628.pdf
Code: mswellhao/PacSum
Data: New York Times Annotated Corpus