Jörg Schlötterer


2023

Privacy-Preserving Knowledge Transfer through Partial Parameter Sharing
Paul Youssef | Jörg Schlötterer | Christin Seifert
Proceedings of the 5th Clinical Natural Language Processing Workshop

Valuable datasets that contain sensitive information are not shared due to privacy and copyright concerns. This hinders progress in many areas and prevents the use of machine learning solutions to solve relevant tasks. One possible solution is sharing models that are trained on such datasets. However, this is also associated with potential privacy risks due to data extraction attacks. In this work, we propose a solution based on sharing parts of the model’s parameters and using a proxy dataset for complementary knowledge transfer. Our experiments show encouraging results and a reduced risk from potential training data identification attacks. We present a viable solution for sharing knowledge with data-disadvantaged parties that do not have the resources to produce high-quality data, with reduced privacy risks to the sharing parties. We make our code publicly available.

2022

Patient-friendly Clinical Notes: Towards a new Text Simplification Dataset
Jan Trienes | Jörg Schlötterer | Hans-Ulrich Schildhaus | Christin Seifert
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)

Automatic text simplification can help patients better understand their own clinical notes. A major hurdle for the development of clinical text simplification methods is the lack of high-quality resources. We report ongoing efforts in creating a parallel dataset of professionally simplified clinical notes. Currently, this corpus consists of 851 document-level simplifications of German pathology reports. We highlight characteristics of this dataset and establish first baselines for paragraph-level simplification.