Abstract
Cultural heritage (CH) data is a rich source of information about historical and cultural development. When used with due understanding of its intrinsic complexity, it can both support research in the social sciences and humanities and serve as input for machine learning (ML) and artificial intelligence algorithms. In all cases, ethical and contextual considerations can be encouraged when the relevant information is provided to potential users in a clear and well-structured form before they begin to interact with the data. The proposed data-envelopes, based on existing documentation frameworks, address the particular needs and challenges of the cultural heritage field while combining machine-readability and user-friendliness. We develop data-envelopes and test their usability on data from the Huygens Institute for History and Culture of the Netherlands. This paper presents the following contributions: i) we highlight the complexity of CH data, featuring the unique ethical and contextual considerations it entails; ii) we evaluate and compare existing dataset documentation frameworks, examining their suitability for CH datasets; iii) we introduce the “data-envelope”, a machine-readable adaptation of existing dataset documentation frameworks, to address the specificities of CH datasets. Its modular form is designed to serve not only the needs of ML, but also, and especially, broader user groups ranging from humanities scholars and governmental monitoring authorities to citizen scientists and the general public. Importantly, the data-envelope framework emphasises the legal and ethical dimensions of dataset documentation, facilitating compliance with evolving data protection regulations and enhancing the accountability of data stewardship in the cultural heritage sector.
We discuss ethical considerations and invite readers to continue the conversation on this topic, including how different audiences should be informed about the importance of dataset documentation management and of the datasets' context.
- Anthology ID: 2024.legal-1.9
- Volume: Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024
- Month: May
- Year: 2024
- Address: Torino, Italia
- Editors: Ingo Siegert, Khalid Choukri
- Venues: LEGAL | WS
- Publisher: ELRA and ICCL
- Pages: 52–65
- URL: https://aclanthology.org/2024.legal-1.9
- Cite (ACL): Mrinalini Luthra and Maria Eskevich. 2024. Data-Envelopes for Cultural Heritage: Going beyond Datasheets. In Proceedings of the Workshop on Legal and Ethical Issues in Human Language Technologies @ LREC-COLING 2024, pages 52–65, Torino, Italia. ELRA and ICCL.
- Cite (Informal): Data-Envelopes for Cultural Heritage: Going beyond Datasheets (Luthra & Eskevich, LEGAL-WS 2024)
- PDF: https://preview.aclanthology.org/landing_page/2024.legal-1.9.pdf