StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation

Yulun Du; Lydia Chilton

doi:10.18653/v1/2023.acl-long.171

Abstract

Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce StoryWars, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types on StoryWars, comprising 7 understanding and 5 generation task types, and derive 101 diverse story-related tasks in total as a multi-task benchmark covering fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, InstructStory, for the story tasks, showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, also obtains the best performance on the fully-supervised tasks in StoryWars, establishing strong multi-task benchmark performance.

Anthology ID: 2023.acl-long.171
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 3044–3062
URL: https://aclanthology.org/2023.acl-long.171
DOI: 10.18653/v1/2023.acl-long.171
Cite (ACL): Yulun Du and Lydia Chilton. 2023. StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3044–3062, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation (Du & Chilton, ACL 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2023.acl-long.171.pdf
Video: https://preview.aclanthology.org/nschneid-patch-5/2023.acl-long.171.mp4
