One Side of the Coin: Development of an ASL-English Parallel Corpus by Leveraging SRT Files

Rafael Treviño, Julie A. Hochgesang, Emily P. Shaw, Nic Willow


Abstract
We report on a method used to develop a sizable parallel corpus of English and American Sign Language (ASL). The effort is part of the Gallaudet University Documentation of ASL (GUDA) project, which is currently coordinated by an interdisciplinary team from the Department of Linguistics and the Department of Interpretation and Translation at Gallaudet University. Creation of the parallel corpus makes use of the available SRT (SubRip Subtitle) files of ASL videos that have been interpreted into or from English, or captioned into English. The corpus allows for one-way searches based on the English translation or interpretation, which is useful for translators, interpreters, and those conducting comparative analyses. We conclude with a discussion of important considerations for this method of constructing a parallel corpus, as well as next steps that will help to refine the development and utility of this type of corpus.
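As a rough illustration of the one-way search described above, the sketch below parses SRT cues (index line, `start --> end` time range, caption text) and looks up English text to recover the time span of the corresponding video segment. It is a minimal sketch under stated assumptions, not the GUDA project's actual pipeline: the sample captions and the names `Cue`, `parse_srt`, and `search` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical SRT content for illustration; real files would be read from disk.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Hello, my name is Alex.

2
00:00:03,600 --> 00:00:06,000
I grew up in a Deaf family.
"""

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # English caption text


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:00:03,500 to seconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues; blocks without a time range are skipped."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.strip().splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue
        start_raw, end_raw = lines[1].split("-->")
        cues.append(
            Cue(
                index=int(lines[0]),
                start=to_seconds(start_raw),
                end=to_seconds(end_raw),
                text=" ".join(lines[2:]),
            )
        )
    return cues


def search(cues: List[Cue], query: str) -> List[Cue]:
    """One-way search: match on the English text, return the timed cues."""
    needle = query.lower()
    return [cue for cue in cues if needle in cue.text.lower()]


if __name__ == "__main__":
    for hit in search(parse_srt(SAMPLE_SRT), "deaf"):
        print(f"{hit.start:.1f}s to {hit.end:.1f}s: {hit.text}")
```

The search runs in one direction only because the SRT file carries just the English text and its timing; the ASL side is reachable only as the video segment between `start` and `end`.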
Anthology ID:
2020.signlang-1.36
Volume:
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch
Venue:
SignLang
Publisher:
European Language Resources Association (ELRA)
Pages:
224–230
Language:
English
URL:
https://aclanthology.org/2020.signlang-1.36
Cite (ACL):
Rafael Treviño, Julie A. Hochgesang, Emily P. Shaw, and Nic Willow. 2020. One Side of the Coin: Development of an ASL-English Parallel Corpus by Leveraging SRT Files. In Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives, pages 224–230, Marseille, France. European Language Resources Association (ELRA).
Cite (Informal):
One Side of the Coin: Development of an ASL-English Parallel Corpus by Leveraging SRT Files (Treviño et al., SignLang 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.signlang-1.36.pdf