Keyphrase Generation with GANs in Low-Resources Scenarios

Giuseppe Lancioni, Saida S. Mohamed, Beatrice Portelli, Giuseppe Serra, Carlo Tasso


Abstract
Keyphrase Generation is the task of predicting Keyphrases (KPs), short phrases that summarize the semantic meaning of a given document. Several past studies have proposed diverse approaches for generating Keyphrases for an input document. However, all of these approaches still need to be trained on very large datasets. In this paper, we introduce BeGanKP, a new conditional GAN model that addresses the problem of Keyphrase Generation in a low-resource scenario. Our main contribution lies in the Discriminator's architecture: a new BERT-based module that reliably distinguishes between generated and human-curated KPs. Its characteristics allow us to use it in a low-resource scenario, where only a small amount of training data is available, and still obtain an efficient Generator. On five public datasets, the resulting architecture achieves results competitive with state-of-the-art approaches while using less than 1% of the training data.
Anthology ID:
2020.sustainlp-1.12
Volume:
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | sustainlp
Publisher:
Association for Computational Linguistics
Pages:
89–96
URL:
https://aclanthology.org/2020.sustainlp-1.12
DOI:
10.18653/v1/2020.sustainlp-1.12
Cite (ACL):
Giuseppe Lancioni, Saida S. Mohamed, Beatrice Portelli, Giuseppe Serra, and Carlo Tasso. 2020. Keyphrase Generation with GANs in Low-Resources Scenarios. In Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, pages 89–96, Online. Association for Computational Linguistics.
Cite (Informal):
Keyphrase Generation with GANs in Low-Resources Scenarios (Lancioni et al., sustainlp 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.sustainlp-1.12.pdf
Video:
https://slideslive.com/38939434
Data
KP20k