Abstract
Text structuring is a fundamental step in NLG, especially when generating multi-sentential text. With the goal of fostering more general and data-driven approaches to text structuring, we propose the new and domain-independent NLG task of structuring and ordering a (possibly large) set of EDUs. We then present a solution for this task that combines neural dependency tree induction with pointer networks, and can be trained on large discourse treebanks that have only recently become available. Further, we propose a new evaluation metric that is arguably more suitable for our new task compared to existing content ordering metrics. Finally, we empirically show that our approach outperforms competitive alternatives on the proposed measure and is equivalent in performance with respect to previously established measures.

- Anthology ID:
- 2020.findings-emnlp.281
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3141–3152
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.281/
- DOI:
- 10.18653/v1/2020.findings-emnlp.281
- Cite (ACL):
- Grigorii Guz and Giuseppe Carenini. 2020. Towards Domain-Independent Text Structuring Trainable on Large Discourse Treebanks. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3141–3152, Online. Association for Computational Linguistics.
- Cite (Informal):
- Towards Domain-Independent Text Structuring Trainable on Large Discourse Treebanks (Guz & Carenini, Findings 2020)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.281.pdf
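The abstract describes ordering a set of EDUs with a pointer network. The following is a minimal illustrative sketch of pointer-style decoding, not the authors' model: it greedily selects the next EDU by attention over the not-yet-placed items, with a fixed bilinear score matrix `W` standing in for learned weights and the previously selected EDU's embedding standing in for a real decoder state. All names (`pointer_order`, `W`) are hypothetical.

```python
import numpy as np

def pointer_order(edu_embeddings: np.ndarray, W: np.ndarray) -> list:
    """Return a predicted ordering (a permutation of EDU indices).

    edu_embeddings: (n, d) array, one row per EDU.
    W: (d, d) bilinear scoring matrix (stand-in for trained parameters).
    """
    n, d = edu_embeddings.shape
    remaining = set(range(n))
    # Zero initial "decoder state"; a real pointer network would use the
    # hidden state of a trained decoder here.
    state = np.zeros(d)
    order = []
    for _ in range(n):
        scores = edu_embeddings @ W @ state  # (n,) attention logits
        # Mask EDUs that have already been placed in the ordering.
        for i in range(n):
            if i not in remaining:
                scores[i] = -np.inf
        pick = int(np.argmax(scores))
        order.append(pick)
        remaining.remove(pick)
        # Feed the chosen EDU back in as the next query state.
        state = edu_embeddings[pick]
    return order
```

In a trained model the attention logits would be learned so that the argmax at each step points at the EDU most likely to come next; the masking step is what guarantees the output is a valid permutation.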