Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition

Liming Wang; Junrui Ni; Heting Gao; Jialu Li; Kai Chieh Chang; Xulin Fan; Junkai Wu; Mark Hasegawa-Johnson; Chang Yoo

doi:10.18653/v1/2023.findings-acl.424

Abstract

Existing supervised sign language recognition systems rely on an abundance of well-annotated data. In contrast, an unsupervised speech-to-sign language recognition (SSR-U) system learns to translate between spoken and sign languages by observing only non-parallel speech and sign-language corpora. We propose speech2sign-U, a neural network-based approach capable of both character-level and word-level SSR-U. Our approach significantly outperforms baselines directly adapted from unsupervised speech recognition (ASR-U) models by as much as 50% recall@10 on several challenging American Sign Language corpora that vary in sample size, vocabulary size, and audio and visual variability. The code is available at https://github.com/cactuswiththoughts/UnsupSpeech2Sign.git.

Anthology ID: 2023.findings-acl.424
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 6785–6800
URL: https://aclanthology.org/2023.findings-acl.424
DOI: 10.18653/v1/2023.findings-acl.424
Cite (ACL): Liming Wang, Junrui Ni, Heting Gao, Jialu Li, Kai Chieh Chang, Xulin Fan, Junkai Wu, Mark Hasegawa-Johnson, and Chang Yoo. 2023. Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6785–6800, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition (Wang et al., Findings 2023)
PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.findings-acl.424.pdf
Video: https://preview.aclanthology.org/proper-vol2-ingestion/2023.findings-acl.424.mp4
