Brecht Nijman


2024

The GLOBALISE project’s digitalisation of the Dutch East India Company (VOC) archives raises questions about representing gender and marginalised identities. This paper outlines the challenges of accurately conveying gender information in the archives, highlighting issues such as the lack of self-identified gender descriptions, the low representation of marginalised groups, the colonial context, and the multilingualism of the collection. Machine learning (ML) and machine translation (MT) used in the digitalisation process may amplify existing biases and under-representation. To address these issues, the paper proposes a gender policy for GLOBALISE, offering guidelines and methodologies for handling gender information and increasing the visibility of marginalised identities. The policy contributes to discussions about representing gender and diversity in digital historical research, ML, and MT.