Rigid Formats Controlled Text Generation

Piji Li, Haisong Zhang, Xiaojiang Liu, Shuming Shi


Abstract
Neural text generation has made tremendous progress on a variety of tasks. A common characteristic of most of these tasks is that the generated text is not restricted to a rigid format. However, some special text paradigms, such as lyrics (assuming the music score is given), sonnets, and SongCi (classical Chinese poetry of the Song dynasty), do impose such restrictions. These texts have three typical characteristics: (1) they must comply fully with rigid predefined formats; (2) they must obey rhyming schemes; and (3) although they are restricted to these formats, sentence integrity must be guaranteed. To the best of our knowledge, text generation under predefined rigid formats has not been well investigated. We therefore propose a simple and elegant framework named SongNet to tackle this problem. The backbone of the framework is a Transformer-based auto-regressive language model. Sets of symbols are tailor-designed to improve modeling performance, especially on format, rhyme, and sentence integrity. We modify the attention mechanism so that the model captures some future information about the format. A pre-training and fine-tuning framework is designed to further improve generation quality. Extensive experiments on two collected corpora demonstrate that the proposed framework generates significantly better results in terms of both automatic metrics and human evaluation.
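To make the idea of a rigid format more concrete, the following is a minimal sketch of how a template could be expanded into per-position symbols so that every slot carries the sentence it belongs to, how many slots remain in that sentence, and whether it is an ordinary token, a rhyming token, or punctuation. The names `Slot`, `build_format_slots`, and `complies`, as well as the exact encoding, are assumptions made for illustration; they are not taken from the abstract and should not be read as the authors' implementation.

```python
from typing import List, NamedTuple


class Slot(NamedTuple):
    segment: int    # index of the sentence this slot belongs to
    remaining: int  # number of slots left in the sentence after this one
    kind: str       # "token", "rhyme", or "punct" (hypothetical labels)


def build_format_slots(lengths: List[int]) -> List[Slot]:
    """Expand a template of sentence lengths into per-position slots.

    Assumption: each sentence has `n` tokens followed by exactly one
    punctuation slot, and the last token of each sentence is the rhyme slot.
    """
    slots: List[Slot] = []
    for segment, n_tokens in enumerate(lengths):
        if n_tokens < 1:
            raise ValueError("each sentence needs at least one token")
        total = n_tokens + 1  # tokens plus one closing punctuation slot
        for i in range(total):
            if i == total - 1:
                kind = "punct"
            elif i == total - 2:
                kind = "rhyme"
            else:
                kind = "token"
            # Counting down lets a left-to-right model see how far it is
            # from the end of the sentence, i.e. some "future" format info.
            slots.append(Slot(segment, total - 1 - i, kind))
    return slots


def complies(sentences: List[List[str]], lengths: List[int]) -> bool:
    """Check that tokenized sentences match the template lengths exactly."""
    return [len(sentence) for sentence in sentences] == lengths
```

For example, `build_format_slots([3, 2])` yields seven slots: four for the first sentence and three for the second, each ending in a rhyme slot followed by a punctuation slot. A descending counter is only one possible way to expose the remaining length to an auto-regressive model.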
Anthology ID:
2020.acl-main.68
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
742–751
URL:
https://aclanthology.org/2020.acl-main.68
DOI:
10.18653/v1/2020.acl-main.68
Cite (ACL):
Piji Li, Haisong Zhang, Xiaojiang Liu, and Shuming Shi. 2020. Rigid Formats Controlled Text Generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 742–751, Online. Association for Computational Linguistics.
Cite (Informal):
Rigid Formats Controlled Text Generation (Li et al., ACL 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.acl-main.68.pdf
Video:
http://slideslive.com/38928912
Code:
lipiji/SongNet