Piotr Mostowski


2022

pdf
Open Repository of the Polish Sign Language Corpus: Publication Project of the Polish Sign Language Corpus
Anna Kuder | Joanna Wójcicka | Piotr Mostowski | Paweł Rutkowski
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources

Between 2010 and 2020, the research team of the Section for Sign Linguistics collected, annotated, and translated a large corpus of Polish Sign Language (polski język migowy, PJM). After this task was finished, a substantial part of the gathered materials was published online as the Open Repository of the Polish Sign Language Corpus. The current paper gives an overview of the process of converting the material from the Corpus into the Repository. If presents and explains the decisions made along the way and describes the process of data preparation and publication. There are two levels of access to the Repository, which are meant to fulfil the needs of a wide range of public users, from members of the Deaf community, through hearing students of PJM, sign language teachers and interpreters, to users with academic background. We describe how corpus material available in open access was prepared to be searchable by text type and elicitation tasks, by sociolinguistic metadata, and by translation into written Polish. We go on to explain how access for research purposes differs from open access. We present possible ways in which data gathered in the Repository may be used by members of the signing community in Poland and abroad.