Evaluating Discourse in Structured Text Representations

Elisa Ferracane, Greg Durrett, Junyi Jessy Li, Katrin Erk


Abstract
Discourse structure is integral to understanding a text and is helpful in many NLP tasks. Learning latent representations of discourse is an attractive alternative to acquiring expensive labeled discourse data. Liu and Lapata (2018) propose a structured attention mechanism for text classification that derives a tree over a text, akin to an RST discourse tree. We examine this model in detail, and evaluate on additional discourse-relevant tasks and datasets, in order to assess whether the structured attention improves performance on the end task and whether it captures a text’s discourse structure. We find the learned latent trees have little to no structure and instead focus on lexical cues; even after obtaining more structured trees with proposed model modifications, the trees are still far from capturing discourse structure when compared to discourse dependency trees from an existing discourse parser. Finally, ablation studies show the structured attention provides little benefit, sometimes even hurting performance.
Anthology ID:
P19-1062
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
646–653
URL:
https://aclanthology.org/P19-1062
DOI:
10.18653/v1/P19-1062
Cite (ACL):
Elisa Ferracane, Greg Durrett, Junyi Jessy Li, and Katrin Erk. 2019. Evaluating Discourse in Structured Text Representations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 646–653, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Evaluating Discourse in Structured Text Representations (Ferracane et al., ACL 2019)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/P19-1062.pdf
Poster:
P19-1062.Poster.pdf
Code:
elisaF/structured