Wenqing Yao


2022

pdf
GCPG: A General Framework for Controllable Paraphrase Generation
Kexin Yang | Dayiheng Liu | Wenqiang Lei | Baosong Yang | Haibo Zhang | Xue Zhao | Wenqing Yao | Boxing Chen
Findings of the Association for Computational Linguistics: ACL 2022

Controllable paraphrase generation (CPG) incorporates various external conditions to obtain desirable paraphrases. However, existing works only highlight a special condition under two indispensable aspects of CPG (i.e., lexically and syntactically CPG) individually, lacking a unified circumstance to explore and analyze their effectiveness. In this paper, we propose a general controllable paraphrase generation framework (GCPG), which represents both lexical and syntactical conditions as text sequences and uniformly processes them in an encoder-decoder paradigm. Under GCPG, we reconstruct commonly adopted lexical condition (i.e., Keywords) and syntactical conditions (i.e., Part-Of-Speech sequence, Constituent Tree, Masked Template and Sentential Exemplar) and study the combination of the two types. In particular, for Sentential Exemplar condition, we propose a novel exemplar construction method — Syntax-Similarity based Exemplar (SSE). SSE retrieves a syntactically similar but lexically different sentence as the exemplar for each target sentence, avoiding exemplar-side words copying problem. Extensive experiments demonstrate that GCPG with SSE achieves state-of-the-art performance on two popular benchmarks. In addition, the combination of lexical and syntactical conditions shows the significant controllable ability of paraphrase generation, and these empirical results could provide novel insight to user-oriented paraphrasing.

pdf
Self-supervised Product Title Rewrite for Product Listing Ads
Xue Zhao | Dayiheng Liu | Junwei Ding | Liang Yao | Mahone Yan | Huibo Wang | Wenqing Yao
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track

Product Listing Ads (PLAs) are primary online advertisements merchants pay to attract more customers. However, merchants prefer to stack various attributes to the title and neglect the fluency and information priority. These seller-created titles are not suitable for PLAs as they fail to highlight the core information in the visible part in PLAs titles. In this work, we present a title rewrite solution. Specifically, we train a self-supervised language model to generate high-quality titles in terms of fluency and information priority. Extensive offline test and real-world online test have demonstrated that our solution is effective in reducing the cost and gaining more profit as it lowers our CPC, CPB while improving CTR in the online test by a large amount.