GrEma: an HTR model for automated transcriptions of the Girifalco asylum’s medical records

Grazia Serratore, Emanuela Nicole Donato, Erika Pasceri, Antonietta Folino, Maria Chiaravalloti


Abstract
This paper deals with the digitization and transcription of medical records from the historical archive of the former psychiatric hospital of Girifalco (Catanzaro, Italy). The digitization is carried out on the premises where the asylum once stood and where the historical archive is stored. Using the ScanSnap SV600 flatbed scanner, a copy compliant with the original is produced for each document contained in the medical records. Subsequently, the different training phases of a Handwritten Text Recognition (HTR) model with the Transkribus tool are presented. The transcription aims to obtain texts in an interoperable format, and it was applied exclusively to the clinical documents, such as the informative form, the nosological table and the clinical diary. This paper describes the training phases of a customized model for medical record transcription, named GrEma, presenting its benefits, limitations and possible future applications. This work was carried out in compliance with current legislation on the protection of personal data. It also highlights the importance of digitization and transcription for the recovery and preservation of historical archives from former psychiatric institutions, ensuring these valuable documents remain accessible for future research and potential users.
Anthology ID:
2025.ldk-1.30
Volume:
Proceedings of the 5th Conference on Language, Data and Knowledge
Month:
September
Year:
2025
Address:
Naples, Italy
Editors:
Mehwish Alam, Andon Tchechmedjiev, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venues:
LDK | WS
Publisher:
Unior Press
Pages:
301–311
URL:
https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.30/
Cite (ACL):
Grazia Serratore, Emanuela Nicole Donato, Erika Pasceri, Antonietta Folino, and Maria Chiaravalloti. 2025. GrEma: an HTR model for automated transcriptions of the Girifalco asylum’s medical records. In Proceedings of the 5th Conference on Language, Data and Knowledge, pages 301–311, Naples, Italy. Unior Press.
Cite (Informal):
GrEma: an HTR model for automated transcriptions of the Girifalco asylum’s medical records (Serratore et al., LDK 2025)
PDF:
https://preview.aclanthology.org/ldl-25-ingestion/2025.ldk-1.30.pdf