Daniel Capurro


Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text
Jinghui Liu | Daniel Capurro | Anthony Nguyen | Karin Verspoor
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association


Evaluating User Preferences in Machine Translation Using Conjoint Analysis
Katrin Kirchhoff | Daniel Capurro | Anne Turner
Proceedings of the 16th Annual Conference of the European Association for Machine Translation

Statistical Section Segmentation in Free-Text Clinical Records
Michael Tepper | Daniel Capurro | Fei Xia | Lucy Vanderwende | Meliha Yetisgen-Yildiz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Automatically segmenting and classifying clinical free text into sections is an important first step to automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within. In this work we describe our approach to automatic section segmentation of clinical records such as hospital discharge summaries and radiology reports, along with section classification into pre-defined section categories. We apply machine learning to the problems of section segmentation and section classification, comparing a joint (one-step) and a pipeline (two-step) approach. We demonstrate that our systems perform well when tested on three data sets, two for hospital discharge summaries and one for radiology reports. We then show the usefulness of section information by incorporating it in the task of extracting comorbidities from discharge summaries.