Heting Gao


2023

pdf
Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition
Liming Wang | Junrui Ni | Heting Gao | Jialu Li | Kai Chieh Chang | Xulin Fan | Junkai Wu | Mark Hasegawa-Johnson | Chang Yoo
Findings of the Association for Computational Linguistics: ACL 2023

Existing supervised sign language recognition systems rely on an abundance of well-annotated data. Instead, an unsupervised speech-to-sign language recognition (SSR-U) system learns to translate between spoken and sign languages by observing only non-parallel speech and sign-language corpora. We propose speech2sign-U, a neural network-based approach capable of both character-level and word-level SSR-U. Our approach significantly outperforms baselines directly adapted from unsupervised speech recognition (ASR-U) models by as much as 50% recall@10 on several challenging American sign language corpora with various levels of sample sizes, vocabulary sizes, and audio and visual variability. The code is available at https://github.com/cactuswiththoughts/UnsupSpeech2Sign.gitcactuswiththoughts/UnsupSpeech2Sign.git.