Text Summarization with Pretrained Encoders

Yang Liu; Mirella Lapata

doi:10.18653/v1/D19-1387

Abstract

Bidirectional Encoder Representations from Transformers (BERT) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks. In this paper, we showcase how BERT can be usefully applied in text summarization and propose a general framework for both extractive and abstractive models. We introduce a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences. Our extractive model is built on top of this encoder by stacking several inter-sentence Transformer layers. For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not). We also demonstrate that a two-staged fine-tuning approach can further boost the quality of the generated summaries. Experiments on three datasets show that our model achieves state-of-the-art results across the board in both extractive and abstractive settings.
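The fine-tuning schedule described in the abstract, with one optimizer for the pretrained encoder and another for the untrained decoder, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from this page: the inverse-square-root warmup form of the schedule, the toy `ScheduledSGD` class (a hypothetical stand-in for a real optimizer), the dummy parameters and gradients, and the numeric learning rates and warmup lengths. The only point it is meant to show is that the two halves of the model are updated by separate optimizers with separate schedules, so the pretrained side moves more cautiously than the freshly initialised side.

```python
from dataclasses import dataclass


def warmup_lr(base_lr: float, step: int, warmup: int) -> float:
    """Linear warmup followed by inverse-square-root decay (assumed form)."""
    step = max(step, 1)
    return base_lr * min(step ** -0.5, step * warmup ** -1.5)


@dataclass
class ScheduledSGD:
    """Toy optimizer (hypothetical): plain SGD whose step size follows warmup_lr."""
    params: list      # list of floats, updated in place
    base_lr: float
    warmup: int
    step_count: int = 0

    def step(self, grads: list) -> float:
        """Apply one update and return the learning rate that was used."""
        self.step_count += 1
        lr = warmup_lr(self.base_lr, self.step_count, self.warmup)
        for i, g in enumerate(grads):
            self.params[i] -= lr * g
        return lr


# Hypothetical parameter vectors standing in for the two halves of the model.
encoder_params = [0.5, -0.3]  # stands in for pretrained weights
decoder_params = [0.0, 0.0]   # stands in for weights trained from scratch

# Illustrative hyperparameters only: a smaller peak rate and longer warmup
# for the encoder, a larger peak rate and shorter warmup for the decoder.
enc_opt = ScheduledSGD(encoder_params, base_lr=2e-3, warmup=20_000)
dec_opt = ScheduledSGD(decoder_params, base_lr=0.1, warmup=10_000)

for _ in range(100):
    grads = [1.0, 1.0]  # dummy gradients in place of real backpropagation
    enc_lr = enc_opt.step(grads)
    dec_lr = dec_opt.step(grads)
```

In a real training loop the two optimizers would each be given the corresponding parameter group of the network, and both would be stepped after every backward pass; the sketch keeps only that two-optimizer structure.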

Anthology ID: D19-1387
Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues: EMNLP | IJCNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 3730–3740
URL: https://aclanthology.org/D19-1387
DOI: 10.18653/v1/D19-1387
Cite (ACL): Yang Liu and Mirella Lapata. 2019. Text Summarization with Pretrained Encoders. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3730–3740, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Text Summarization with Pretrained Encoders (Liu & Lapata, EMNLP-IJCNLP 2019)
PDF: https://preview.aclanthology.org/nschneid-patch-3/D19-1387.pdf
Attachment: D19-1387.Attachment.pdf
Code: nlpyang/PreSumm + additional community code
Data: CNN/Daily Mail, New York Times Annotated Corpus, XSum
