Picking Apart Story Salads

Su Wang, Eric Holgate, Greg Durrett, Katrin Erk


Abstract
During natural disasters and conflicts, information about what happened is often confusing and messy, and distributed across many sources. We would like to be able to automatically identify relevant information and assemble it into coherent narratives of what happened. To make this task accessible to neural models, we introduce Story Salads, mixtures of multiple documents that can be generated at scale. By exploiting the Wikipedia hierarchy, we can generate salads that exhibit challenging inference problems. Story salads give rise to a novel, challenging clustering task, where the objective is to group sentences from the same narratives. We demonstrate that simple bag-of-words similarity clustering falls short on this task, and that it is necessary to take into account global context and coherence.
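As a rough illustration of the setup the abstract describes, the sketch below mixes the sentences of two toy documents into a "salad" and then tries to separate them again using bag-of-words cosine similarity alone. It is a minimal sketch under stated assumptions, not the authors' data pipeline or model: the function names (`make_salad`, `cluster_two`), the whitespace tokenization, and the seed-based assignment are all hypothetical choices made for illustration. A baseline of this kind is what the abstract reports as falling short on realistic salads.

```python
import random
from collections import Counter
from math import sqrt


def bow(sentence):
    """Bag of words: counts of lowercased whitespace tokens (deliberately crude)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def make_salad(doc_a, doc_b, seed=0):
    """Mix the sentences of two documents (hypothetical helper).

    Returns the shuffled sentences and gold labels (0 = doc_a, 1 = doc_b).
    """
    items = [(s, 0) for s in doc_a] + [(s, 1) for s in doc_b]
    random.Random(seed).shuffle(items)
    return [s for s, _ in items], [label for _, label in items]


def cluster_two(sentences):
    """Split sentences into two groups using word overlap only.

    Assumption for this sketch: the two least similar sentences serve as
    seeds, and every sentence joins the seed it is more similar to.
    """
    vecs = [bow(s) for s in sentences]
    n = len(vecs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    i, j = min(pairs, key=lambda p: cosine(vecs[p[0]], vecs[p[1]]))
    return [0 if cosine(v, vecs[i]) >= cosine(v, vecs[j]) else 1 for v in vecs]
```

On toy documents with disjoint vocabularies this recovers the original grouping up to a swap of the two labels. Nothing in it looks at sentence order or at coherence across the whole narrative, which is the gap the abstract points to.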
Anthology ID: D18-1175
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 1455–1465
URL: https://aclanthology.org/D18-1175
DOI: 10.18653/v1/D18-1175
Cite (ACL): Su Wang, Eric Holgate, Greg Durrett, and Katrin Erk. 2018. Picking Apart Story Salads. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1455–1465, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Picking Apart Story Salads (Wang et al., EMNLP 2018)
PDF: https://preview.aclanthology.org/ml4al-ingestion/D18-1175.pdf