From News to Medical: Cross-domain Discourse Segmentation

Elisa Ferracane; Titan Page; Junyi Jessy Li; Katrin Erk

doi:10.18653/v1/W19-2704

Abstract

The first step in discourse analysis involves dividing a text into segments. We annotate the first high-quality small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. As expected, we find a drop in performance, but the nature of the segmentation errors suggests that some problems can be addressed earlier in the pipeline, while others would require expanding the corpus to a trainable size to learn the nuances of the medical domain.

Anthology ID: W19-2704
Volume: Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019
Month: June
Year: 2019
Address: Minneapolis, MN
Editors: Amir Zeldes, Debopam Das, Erick Maziero Galani, Juliano Desiderato Antonio, Mikel Iruskieta
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 22–29
URL: https://aclanthology.org/W19-2704
DOI: 10.18653/v1/W19-2704
Cite (ACL): Elisa Ferracane, Titan Page, Junyi Jessy Li, and Katrin Erk. 2019. From News to Medical: Cross-domain Discourse Segmentation. In Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, pages 22–29, Minneapolis, MN. Association for Computational Linguistics.
Cite (Informal): From News to Medical: Cross-domain Discourse Segmentation (Ferracane et al., NAACL 2019)
PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/W19-2704.pdf
Presentation: W19-2704.Presentation.pdf
Code: elisaF/news-med-segmentation
