Learning From the Source Document: Unsupervised Abstractive Summarization

Haojie Zhuang, Wei Emma Zhang, Jian Yang, Congbo Ma, Yutong Qu, Quan Z. Sheng


Abstract
Most state-of-the-art methods for abstractive text summarization are trained under supervision and rely heavily on large-scale, high-quality parallel corpora. In this paper, we remove the need for reference summaries and present SCR (Summarize, Contrast and Review), an unsupervised learning method for abstractive summarization. SCR leverages contrastive learning and is the first work to apply it to unsupervised abstractive summarization. Specifically, we use the true source documents as positive examples and strategically generated fake source documents as negative examples to train the model to generate good summaries. Furthermore, we improve the writing quality of the generated summaries by guiding them to resemble human-written text. Extensive experiments show that SCR outperforms other unsupervised abstractive summarization baselines, demonstrating its effectiveness.
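The positive/negative contrast described in the abstract can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch under stated assumptions, not the objective used in SCR: the abstract does not give the loss, so the function names (`cosine`, `contrastive_loss`), the use of cosine similarity over fixed-size embedding vectors, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(summary_vec, true_doc_vec, fake_doc_vecs, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in, not the paper's formula).

    summary_vec   : embedding of the generated summary
    true_doc_vec  : embedding of the true source document (positive example)
    fake_doc_vecs : embeddings of fake source documents (negative examples)
    """
    pos = cosine(summary_vec, true_doc_vec) / temperature
    negs = [cosine(summary_vec, f) / temperature for f in fake_doc_vecs]
    logits = [pos] + negs
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability assigned to the positive example.
    return log_denom - pos


if __name__ == "__main__":
    true_doc = [1.0, 0.0]
    fake_docs = [[0.0, 1.0]]
    faithful_summary = [1.0, 0.0]    # close to the true source
    unfaithful_summary = [0.0, 1.0]  # close to the fake source
    print(contrastive_loss(faithful_summary, true_doc, fake_docs))
    print(contrastive_loss(unfaithful_summary, true_doc, fake_docs))
```

In this toy setup the loss is small when the summary embedding lies close to the true source document and large when it lies closer to a fake one, which is the behavior a contrastive objective of this kind is meant to encourage.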
Anthology ID: 2022.findings-emnlp.309
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4194–4205
URL: https://aclanthology.org/2022.findings-emnlp.309
Cite (ACL): Haojie Zhuang, Wei Emma Zhang, Jian Yang, Congbo Ma, Yutong Qu, and Quan Z. Sheng. 2022. Learning From the Source Document: Unsupervised Abstractive Summarization. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4194–4205, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Learning From the Source Document: Unsupervised Abstractive Summarization (Zhuang et al., Findings 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.findings-emnlp.309.pdf