Abstract
Paraphrases are texts that convey the same meaning in different expression forms. Traditional seq2seq-based models for paraphrase generation mainly focus on fidelity while ignoring the diversity of outputs. In this paper, we propose a deep generative model to generate diverse paraphrases. We build our model on the conditional generative adversarial network and incorporate a simple yet effective diversity loss term to improve the diversity of outputs. The proposed diversity loss maximizes the ratio of the pairwise distance between generated texts to the pairwise distance between their corresponding latent codes, forcing the generator to attend more to the latent codes and produce diverse samples. Experimental results on paraphrase generation benchmarks show that our proposed model generates more diverse paraphrases than baselines.
- Anthology ID:
- 2020.findings-emnlp.218
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2411–2421
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.218/
- DOI:
- 10.18653/v1/2020.findings-emnlp.218
- Cite (ACL):
- Yue Cao and Xiaojun Wan. 2020. DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2411–2421, Online. Association for Computational Linguistics.
- Cite (Informal):
- DivGAN: Towards Diverse Paraphrase Generation via Diversified Generative Adversarial Network (Cao & Wan, Findings 2020)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.218.pdf
- Data
- MS COCO
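The diversity loss described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distances over embedding vectors, and the function and variable names are hypothetical.

```python
import numpy as np

def diversity_loss(out1, out2, z1, z2, eps=1e-8):
    """Sketch of a mode-seeking-style diversity term.

    Maximizes the ratio of the distance between two generated outputs
    (here, fixed-size embeddings of the generated texts) to the distance
    between their latent codes, pushing the generator to map distinct
    codes to distinct paraphrases. Returned negated, so minimizing this
    value maximizes the ratio."""
    d_out = np.linalg.norm(out1 - out2)        # distance between generated texts
    d_latent = np.linalg.norm(z1 - z2)         # distance between latent codes
    return -d_out / (d_latent + eps)           # eps guards against division by zero

# Usage: two latent codes and the embeddings of the texts they produced.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=16), rng.normal(size=16)
emb1, emb2 = rng.normal(size=32), rng.normal(size=32)
loss = diversity_loss(emb1, emb2, z1, z2)
```

In training, this term would be added to the adversarial objective with some weighting coefficient; identical outputs for different latent codes yield a loss of zero, so the generator is rewarded for spreading distinct codes to distinct paraphrases.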