Abstract
This paper describes a novel application of semi-supervision for shallow discourse parsing. We use a neural approach to sequence tagging and focus on the extraction of explicit discourse arguments. First, additional unlabeled data is prepared for semi-supervised learning. From these data, weak annotations are generated in a first setting and later used in a second setting to study performance differences. Our experiments show an increase in model performance of between 2 and 10 points F1 score. Further, we give some insights into the generated discourse annotations and compare the newly derived relations with the training relations. We release this new dataset of explicit discourse arguments to enable the training of large statistical models.
- Anthology ID:
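For readers unfamiliar with the tri-training scheme named in the title, the following is a minimal, self-contained sketch of classic tri-training (Zhou & Li, 2005) on toy 1-D data, not the authors' neural sequence-tagging setup: three learners label unlabeled examples for each other whenever the other two agree. The centroid "classifier" and all data below are illustrative assumptions.

```python
import random

def fit_centroids(data):
    """Toy classifier: store the per-class mean of 1-D features."""
    sums, counts = {}, {}
    for x, y in data:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda y: abs(x - model[y]))

def tri_train(labeled, unlabeled, rounds=3):
    # Classic tri-training bootstraps three diverse learners; for this
    # tiny demo, all three start from the full labeled set.
    models = [fit_centroids(labeled) for _ in range(3)]
    for _ in range(rounds):
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # If the other two learners agree on an unlabeled example,
            # hand it (with the agreed pseudo-label) to learner i.
            pseudo = [(x, predict(models[j], x)) for x in unlabeled
                      if predict(models[j], x) == predict(models[k], x)]
            models[i] = fit_centroids(labeled + pseudo)
    return models

labeled = [(0.1, "A"), (0.3, "A"), (1.8, "B"), (2.1, "B")]
unlabeled = [0.2, 0.4, 1.9, 2.3]
models = tri_train(labeled, unlabeled)
# At inference time, take a majority vote over the three learners.
votes = [predict(m, 2.0) for m in models]
```

In the paper's setting, the unlabeled pool is additional raw text and the learners are neural sequence taggers whose agreed predictions become the weak argument annotations.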
- 2020.lrec-1.139
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 1103–1109
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.139
- Cite (ACL):
- René Knaebel and Manfred Stede. 2020. Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1103–1109, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion (Knaebel & Stede, LREC 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.139.pdf