Abstract
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation, and opinion mining. In this paper, we present a simple yet highly accurate discourse parser that incorporates recent contextual language models. Our parser establishes new state-of-the-art (SOTA) performance for predicting structure and nuclearity on two key RST datasets, RST-DT and Instr-DT. We further demonstrate that pretraining our parser on the recently available large-scale “silver-standard” discourse treebank MEGA-DT provides even larger performance benefits, suggesting a novel and promising research direction in the field of discourse analysis.
- Anthology ID:
- 2020.coling-main.337
- Volume:
- Proceedings of the 28th International Conference on Computational Linguistics
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- COLING
- Publisher:
- International Committee on Computational Linguistics
- Pages:
- 3794–3805
- URL:
- https://aclanthology.org/2020.coling-main.337
- DOI:
- 10.18653/v1/2020.coling-main.337
- Cite (ACL):
- Grigorii Guz, Patrick Huber, and Giuseppe Carenini. 2020. Unleashing the Power of Neural Discourse Parsers - A Context and Structure Aware Approach Using Large Scale Pretraining. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3794–3805, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal):
- Unleashing the Power of Neural Discourse Parsers - A Context and Structure Aware Approach Using Large Scale Pretraining (Guz et al., COLING 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.coling-main.337.pdf
- Data:
- Instructional-DT (Instr-DT), RST-DT