SummVD : An efficient approach for unsupervised topic-based text summarization
Gabriel Shenouda, Aurélien Bossard, Oussama Ayoub, Christophe Rodrigues
Abstract
This paper introduces SummVD, a new method for automatic unsupervised extractive summarization. The method relies on singular value decomposition, which runs in time linear in the number of words, to reduce the dimensionality of word embeddings and represent words on a small number of dimensions, each corresponding to a hidden topic. It also uses word clustering to reduce the vocabulary size. This representation, specific to one document, reduces the noise introduced by embedding dimensions that are useless in a restricted context. It is followed by a linear sentence extraction heuristic, which makes SummVD an efficient method for text summarization. We evaluate SummVD on several corpora of different kinds (news, scientific articles, social networks). Our method is more effective than recent extractive approaches. Moreover, SummVD requires few resources in terms of data and computing power, so it can be run on long single documents such as scientific papers as well as on large multi-document corpora, and it is fast enough to be used in live summarization systems.
- Anthology ID:
- 2022.aacl-main.38
- Volume:
- Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- November
- Year:
- 2022
- Address:
- Online only
- Editors:
- Yulan He, Heng Ji, Sujian Li, Yang Liu, Chua-Hui Chang
- Venues:
- AACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 501–511
- URL:
- https://aclanthology.org/2022.aacl-main.38
- DOI:
- 10.18653/v1/2022.aacl-main.38
- Cite (ACL):
- Gabriel Shenouda, Aurélien Bossard, Oussama Ayoub, and Christophe Rodrigues. 2022. SummVD : An efficient approach for unsupervised topic-based text summarization. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 501–511, Online only. Association for Computational Linguistics.
- Cite (Informal):
- SummVD : An efficient approach for unsupervised topic-based text summarization (Shenouda et al., AACL-IJCNLP 2022)
- PDF:
- https://preview.aclanthology.org/landing_page/2022.aacl-main.38.pdf
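The abstract describes a pipeline in which document-specific word embeddings are reduced by singular value decomposition to a few "topic" dimensions, followed by a sentence extraction heuristic. The Python sketch below illustrates that general idea only; it is not the authors' implementation. The function name `summarize`, its parameters, the mean-absolute-projection sentence score, and the round-robin selection over topics are all assumptions made for illustration, and the word clustering step mentioned in the abstract is omitted.

```python
import numpy as np


def summarize(sentences, embeddings, n_topics=2, n_sentences=2):
    """Return sorted indices of selected sentences (illustrative sketch only).

    sentences:  list of token lists, one per sentence (assumed input format).
    embeddings: dict mapping token -> 1-D numpy array (assumed input format).
    """
    # Document-specific vocabulary: only words that have an embedding.
    vocab = sorted({t for s in sentences for t in s if t in embeddings})
    if not vocab:
        return []
    index = {t: i for i, t in enumerate(vocab)}

    # Word-by-dimension matrix, centered before decomposition.
    E = np.stack([embeddings[t] for t in vocab])
    E = E - E.mean(axis=0)

    # SVD; keep the k leading directions as hypothetical "topics".
    U, S, _ = np.linalg.svd(E, full_matrices=False)
    k = min(n_topics, len(S))
    word_topics = U[:, :k] * S[:k]  # shape: (n_words, k)

    # Assumed scoring: mean absolute topic projection of a sentence's words.
    scores = np.zeros((len(sentences), k))
    for i, sent in enumerate(sentences):
        rows = [index[t] for t in sent if t in index]
        if rows:
            scores[i] = np.abs(word_topics[rows]).mean(axis=0)

    # Assumed heuristic: cycle over topics, taking the best unused sentence.
    chosen = []
    topic = 0
    while len(chosen) < min(n_sentences, len(sentences)):
        for j in np.argsort(-scores[:, topic % k]):
            if int(j) not in chosen:
                chosen.append(int(j))
                break
        topic += 1
    return sorted(chosen)
```

With a tokenized document and any pre-trained embedding lookup, a call such as `summarize(sents, emb, n_topics=3, n_sentences=2)` returns the positions of the sentences to keep. Restricting the matrix to the document's own vocabulary is what makes the reduced representation document-specific, as the abstract emphasizes.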