Extractive Summarization of Long Documents by Combining Global and Local Context

Wen Xiao, Giuseppe Carenini


Abstract
In this paper, we propose a novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, Pubmed and arXiv, where it outperforms previous work, both extractive and abstractive models, on ROUGE-1, ROUGE-2 and METEOR scores. We also show that, consistent with our goal, the benefits of our method become stronger as we apply it to longer documents. Rather surprisingly, an ablation study indicates that the benefits of our model seem to come exclusively from modeling the local context, even for the longest documents.
Anthology ID:
D19-1298
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3011–3021
URL:
https://aclanthology.org/D19-1298
DOI:
10.18653/v1/D19-1298
Cite (ACL):
Wen Xiao and Giuseppe Carenini. 2019. Extractive Summarization of Long Documents by Combining Global and Local Context. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3011–3021, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Extractive Summarization of Long Documents by Combining Global and Local Context (Xiao & Carenini, EMNLP 2019)
PDF:
https://preview.aclanthology.org/update-css-js/D19-1298.pdf
Attachment:
D19-1298.Attachment.zip
Code:
Wendy-Xiao/Extsumm_local_global_context
Data:
Pubmed, arXiv