Corpora of Disordered Speech in the Light of the GDPR: Two Use Cases from the DELAD Initiative

Henk van den Heuvel, Aleksei Kelli, Katarzyna Klessa, Satu Salaasti


Abstract
Corpora of disordered speech (CDS) are costly to collect and difficult to share because of personal data protection and intellectual property (IP) issues. In this contribution, we discuss the legal grounds for processing CDS in the light of the GDPR and illustrate them with two use cases from the DELAD context. One use case deals with clinical datasets, the other with legacy data from Polish hearing-impaired children. For both cases, we consider processing based on consent and processing based on public interest.
Anthology ID:
2020.lrec-1.406
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3317–3321
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.406
Cite (ACL):
Henk van den Heuvel, Aleksei Kelli, Katarzyna Klessa, and Satu Salaasti. 2020. Corpora of Disordered Speech in the Light of the GDPR: Two Use Cases from the DELAD Initiative. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3317–3321, Marseille, France. European Language Resources Association.
Cite (Informal):
Corpora of Disordered Speech in the Light of the GDPR: Two Use Cases from the DELAD Initiative (van den Heuvel et al., LREC 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.lrec-1.406.pdf