Abstract
While GPT-2 generates sentences that are remarkably human-like, longer documents can ramble and do not follow human-like writing structure. We study the problem of imposing structure on long-range text. We propose a novel controlled text generation task, sequentially controlled text generation, and identify a dataset, NewsDiscourse, as a starting point for this task. We develop a sequential controlled text generation pipeline with generation and editing. We test different degrees of structural awareness and show that, in general, more structural awareness results in higher control-accuracy, grammaticality, coherency and topicality, approaching human-level writing performance.
- Anthology ID:
- 2022.findings-emnlp.509
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2022
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 6848–6866
- URL:
- https://aclanthology.org/2022.findings-emnlp.509
- DOI:
- 10.18653/v1/2022.findings-emnlp.509
- Cite (ACL):
- Alexander Spangher, Yao Ming, Xinyu Hua, and Nanyun Peng. 2022. Sequentially Controlled Text Generation. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6848–6866, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Sequentially Controlled Text Generation (Spangher et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.findings-emnlp.509.pdf