Prior Knowledge and Memory Enriched Transformer for Sign Language Translation

Tao Jin, Zhou Zhao, Meng Zhang, Xingshan Zeng


Abstract
This paper attacks the challenging problem of sign language translation (SLT), which requires not only visual and textual understanding but also learning additional prior knowledge (i.e., signing style and syntax). However, the majority of existing methods, built on vanilla encoder-decoder structures, fail to exploit all of these cues sufficiently. Motivated by this, we propose the Prior knowledge and memory Enriched Transformer (PET) for SLT, which incorporates such auxiliary information into the vanilla transformer. Concretely, we develop a gated interactive multi-head attention that associates the multimodal representation with the global signing style through adaptive gating functions. A Part-of-Speech (POS) sequence generator then uses the associated information to predict the global syntactic structure, which in turn guides sentence generation. In addition, since the visual-textual context and the auxiliary knowledge of a word may appear in more than one video, we design a multi-stream memory structure that stores the detailed correspondence between each word and its various relevant information, yielding a more comprehensive understanding of every word and higher-quality translations. We conduct extensive empirical studies on the RWTH-PHOENIX-Weather-2014 dataset under both signer-dependent and signer-independent conditions. The quantitative and qualitative experimental results comprehensively demonstrate the effectiveness of PET.
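The gated interaction the abstract describes (per-token multimodal features fused with a global signing-style vector through an adaptive gate) can be pictured with the following minimal sketch. This is not the authors' code: the class name, dimensions, and the particular gating formula are hypothetical PyTorch-style assumptions used only to illustrate the general idea.

```python
# Hypothetical sketch of gated interactive attention between token-level
# multimodal features and a global signing-style vector (not the PET code).
import torch
import torch.nn as nn


class GatedStyleFusion(nn.Module):
    """Fuse per-token features with a global style vector via a learned gate."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, tokens: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, d_model); style: (batch, d_model)
        style_kv = style.unsqueeze(1)                         # (batch, 1, d_model)
        attended, _ = self.attn(tokens, style_kv, style_kv)   # tokens attend to style
        g = torch.sigmoid(self.gate(torch.cat([tokens, attended], dim=-1)))
        return g * attended + (1.0 - g) * tokens              # adaptive gated mixture


# Usage: fuse 16 frame-level features of width 512 with one style vector.
fusion = GatedStyleFusion(d_model=512)
out = fusion(torch.randn(2, 16, 512), torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 16, 512])
```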
Anthology ID:
2022.findings-acl.297
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3766–3775
URL:
https://aclanthology.org/2022.findings-acl.297
DOI:
10.18653/v1/2022.findings-acl.297
Cite (ACL):
Tao Jin, Zhou Zhao, Meng Zhang, and Xingshan Zeng. 2022. Prior Knowledge and Memory Enriched Transformer for Sign Language Translation. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3766–3775, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Prior Knowledge and Memory Enriched Transformer for Sign Language Translation (Jin et al., Findings 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.findings-acl.297.pdf
Data
PHOENIX14T