ConRPG: Paraphrase Generation using Contexts as Regularizer

Yuxian Meng, Xiang Ao, Qing He, Xiaofei Sun, Qinghong Han, Fei Wu, Chun Fan, Jiwei Li


Abstract
A long-standing issue with paraphrase generation is the lack of reliable supervision signals. In this paper, we propose a new unsupervised paradigm for paraphrase generation based on the assumption that the probabilities of generating two sentences with the same meaning given the same context should be the same. Inspired by this fundamental idea, we propose a pipelined system which consists of paraphrase candidate generation based on contextual language models, candidate filtering using scoring functions, and paraphrase model training based on the selected candidates. The proposed paradigm offers merits over existing paraphrase generation methods: (1) using the context regularizer on meanings, the model is able to generate massive amounts of high-quality paraphrase pairs; (2) the combination of the huge amount of paraphrase candidates and further diversity-promoting filtering yields paraphrases with more lexical and syntactic diversity; and (3) using human-interpretable scoring functions to select paraphrase pairs from candidates, the proposed framework provides a channel for developers to intervene with the data generation process, leading to a more controllable model. Experimental results across different tasks and datasets demonstrate that the proposed paradigm significantly outperforms existing paraphrase approaches in both supervised and unsupervised setups.
Anthology ID:
2021.emnlp-main.199
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
2551–2562
Language:
URL:
https://aclanthology.org/2021.emnlp-main.199
DOI:
10.18653/v1/2021.emnlp-main.199
Bibkey:
Cite (ACL):
Yuxian Meng, Xiang Ao, Qing He, Xiaofei Sun, Qinghong Han, Fei Wu, Chun Fan, and Jiwei Li. 2021. ConRPG: Paraphrase Generation using Contexts as Regularizer. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 2551–2562, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
ConRPG: Paraphrase Generation using Contexts as Regularizer (Meng et al., EMNLP 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/update-css-js/2021.emnlp-main.199.pdf
Data
COCO