ISLTranslate: Dataset for Translating Indian Sign Language

Abhinav Joshi, Susmit Agrawal, Ashutosh Modi


Abstract
Sign languages are the primary means of communication for many hard-of-hearing people worldwide. Recently, to bridge the communication gap between the hard-of-hearing community and the rest of the population, several sign language translation datasets have been proposed to enable the development of statistical sign language translation systems. However, there is a dearth of sign language resources for Indian Sign Language. This resource paper introduces ISLTranslate, a translation dataset for continuous Indian Sign Language (ISL) consisting of 31k ISL-English sentence/phrase pairs. To the best of our knowledge, it is the largest translation dataset for continuous Indian Sign Language. We provide a detailed analysis of the dataset. To validate the performance of existing end-to-end sign-language-to-spoken-language translation systems, we benchmark the created dataset with a transformer-based model for ISL translation.
Anthology ID:
2023.findings-acl.665
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10466–10475
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2023.findings-acl.665/
DOI:
10.18653/v1/2023.findings-acl.665
Cite (ACL):
Abhinav Joshi, Susmit Agrawal, and Ashutosh Modi. 2023. ISLTranslate: Dataset for Translating Indian Sign Language. In Findings of the Association for Computational Linguistics: ACL 2023, pages 10466–10475, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
ISLTranslate: Dataset for Translating Indian Sign Language (Joshi et al., Findings 2023)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2023.findings-acl.665.pdf