A Spatial Model for Extracting and Visualizing Latent Discourse Structure in Text

Shashank Srivastava, Nebojsa Jojic


Abstract
We present a generative probabilistic model of documents as sequences of sentences, and show that inference in it can lead to extraction of long-range latent discourse structure from a collection of documents. The approach is based on embedding sequences of sentences from longer texts into 2- or 3-D spatial grids, in which one or two coordinates model smooth topic transitions, while the third captures the sequential nature of the modeled text. A significant advantage of our approach is that the learned models are naturally visualizable and interpretable, as semantic similarity and sequential structure are modeled along orthogonal directions in the grid. We show that the method is effective in capturing discourse structures in narrative text across multiple genres, including biographies, stories, and newswire reports. In particular, our method outperforms or is competitive with state-of-the-art generative approaches on tasks such as predicting the outcome of a story and sentence ordering.
Anthology ID:
P18-1211
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2268–2277
URL:
https://aclanthology.org/P18-1211
DOI:
10.18653/v1/P18-1211
Cite (ACL):
Shashank Srivastava and Nebojsa Jojic. 2018. A Spatial Model for Extracting and Visualizing Latent Discourse Structure in Text. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2268–2277, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
A Spatial Model for Extracting and Visualizing Latent Discourse Structure in Text (Srivastava & Jojic, ACL 2018)
PDF:
https://preview.aclanthology.org/update-css-js/P18-1211.pdf
Video:
https://vimeo.com/285805558