Abstract
The first step in discourse analysis involves dividing a text into segments. We annotate the first high-quality small-scale medical corpus in English with discourse segments and analyze how well news-trained segmenters perform on this domain. While we expectedly find a drop in performance, the nature of the segmentation errors suggests some problems can be addressed earlier in the pipeline, while others would require expanding the corpus to a trainable size to learn the nuances of the medical domain.
 - Anthology ID:
 - W19-2704
 - Volume:
 - Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019
 - Month:
 - June
 - Year:
 - 2019
 - Address:
 - Minneapolis, MN
 - Editors:
 - Amir Zeldes, Debopam Das, Erick Maziero Galani, Juliano Desiderato Antonio, Mikel Iruskieta
 - Venue:
 - NAACL
 - Publisher:
 - Association for Computational Linguistics
 - Pages:
 - 22–29
 - URL:
 - https://aclanthology.org/W19-2704
 - DOI:
 - 10.18653/v1/W19-2704
 - Cite (ACL):
 - Elisa Ferracane, Titan Page, Junyi Jessy Li, and Katrin Erk. 2019. From News to Medical: Cross-domain Discourse Segmentation. In Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, pages 22–29, Minneapolis, MN. Association for Computational Linguistics.
 - Cite (Informal):
 - From News to Medical: Cross-domain Discourse Segmentation (Ferracane et al., NAACL 2019)
 - PDF:
 - https://preview.aclanthology.org/ingest-acl-2023-videos/W19-2704.pdf
 - Code:
 - elisaF/news-med-segmentation