Abstract
Recent advances in self-attention neural network architectures have raised the bar for open-ended text generation. Yet, while current methods can produce coherent text several hundred words long, controlling the generated content and evaluating it remain open questions. We propose a controlled generation task based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We introduce human-based evaluation metrics for this task, as well as a method for deriving a large training dataset. We evaluate three methods on this task, all based on fine-tuning pre-trained models. We show that while auto-regressive, unidirectional language models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) that produces competitive fluency while adhering to the requested content.
- Anthology ID: 2020.coling-main.211
- Volume: Proceedings of the 28th International Conference on Computational Linguistics
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Donia Scott, Nuria Bel, Chengqing Zong
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 2329–2345
- URL: https://aclanthology.org/2020.coling-main.211
- DOI: 10.18653/v1/2020.coling-main.211
- Cite (ACL): Eyal Orbach and Yoav Goldberg. 2020. Facts2Story: Controlling Text Generation by Key Facts. In Proceedings of the 28th International Conference on Computational Linguistics, pages 2329–2345, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal): Facts2Story: Controlling Text Generation by Key Facts (Orbach & Goldberg, COLING 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.211.pdf
- Code: eyal-orbach/Facts2Story-data
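The plan-and-cloze approach described in the abstract relies on XLNet's ability to predict a token at an arbitrary masked position while conditioning on fixed context on both sides. The sketch below is a minimal, hypothetical illustration of that cloze step using the off-the-shelf xlnet-base-cased model from HuggingFace transformers; it is not the paper's implementation. The names `fill_between_facts` and `NUM_GAP_TOKENS`, the fixed gap length, and the greedy left-to-right filling are all illustrative assumptions, whereas the paper fine-tunes XLNet and plans where facts should appear rather than interleaving them verbatim.

```python
# Hypothetical sketch of the cloze step: fact sentences are placed in order
# (a trivial "plan") and the gaps between them are filled with XLNet.
# Not the paper's implementation; names and gap sizes are illustrative.
import torch
from transformers import XLNetLMHeadModel, XLNetTokenizer

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")
model.eval()

NUM_GAP_TOKENS = 8  # assumed fixed number of tokens generated between facts

def fill_between_facts(facts):
    """Interleave fact sentences with <mask> placeholders, then predict each
    masked position in turn, conditioning on the fixed facts and on the
    positions already filled."""
    ids, mask_positions = [], []
    for i, fact in enumerate(facts):
        ids.extend(tokenizer.encode(fact, add_special_tokens=False))
        if i < len(facts) - 1:
            for _ in range(NUM_GAP_TOKENS):
                mask_positions.append(len(ids))
                ids.append(tokenizer.mask_token_id)
    input_ids = torch.tensor([ids])

    for pos in mask_positions:  # fill gaps greedily, left to right
        seq_len = input_ids.shape[1]
        # perm_mask[:, j, k] = 1 means token j may not attend to token k;
        # hide the current target position from every token.
        perm_mask = torch.zeros(1, seq_len, seq_len)
        perm_mask[:, :, pos] = 1.0
        # target_mapping selects the single position whose logits we want.
        target_mapping = torch.zeros(1, 1, seq_len)
        target_mapping[0, 0, pos] = 1.0
        with torch.no_grad():
            out = model(input_ids, perm_mask=perm_mask,
                        target_mapping=target_mapping)
        input_ids[0, pos] = out.logits[0, 0].argmax()
    return tokenizer.decode(input_ids[0])

print(fill_between_facts(["The knight left the castle at dawn.",
                          "By nightfall he had reached the river."]))
```

Greedy argmax decoding is used only to keep the sketch short; sampling would yield more natural text, and a fuller treatment would also hide the still-unfilled mask positions from the model and, as in the paper, fine-tune XLNet on the derived training data.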