******************************************************************************************
*
* README - What Clued the AI Doctor In? On the Influence of Data Source and Quality
*          for Transformer-Based Medical Self-Disclosure Detection
*
******************************************************************************************

This file describes two datasets:

1. Expanded dataset
   - 3,919-instance dataset (an expansion of the benchmark MEDSD dataset)
2. Final dataset
   - 9,767-instance dataset used for model training
   - The final dataset is the merged dataset comprising both the original MEDSD dataset and our new expansion after refinement.

We release these datasets upon request via email in compliance with the original MEDSD guidelines.
Thanks!