Abstract
Recent approaches to Sign Language Production (SLP) have adopted spoken language Neural Machine Translation (NMT) architectures, applied without sign-specific modifications. In addition, these works represent sign language as a sequence of skeleton pose vectors, projected to an abstract representation with no inherent skeletal structure. In this paper, we represent sign language sequences as a skeletal graph structure, with joints as nodes and both spatial and temporal connections as edges. To operate on this graphical structure, we propose Skeletal Graph Self-Attention (SGSA), a novel graphical attention layer that embeds a skeleton inductive bias into the SLP model. Retaining the skeletal feature representation throughout, we directly apply a spatio-temporal adjacency matrix within the self-attention formulation. This provides structure and context to each skeletal joint that is not possible when using a non-graphical abstract representation, enabling fluid and expressive sign language production. We evaluate our Skeletal Graph Self-Attention architecture on the challenging RWTH-PHOENIX-Weather-2014T (PHOENIX14T) dataset, achieving state-of-the-art back translation performance, with improvements of 8% and 7% over competing methods on the dev and test sets, respectively.
- Anthology ID:
- 2022.sltat-1.15
- Volume:
- Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, John C. McDonald, Dimitar Shterionov, Rosalee Wolfe
- Venue:
- SLTAT
- Publisher:
- European Language Resources Association
- Pages:
- 95–102
- URL:
- https://aclanthology.org/2022.sltat-1.15
- Cite (ACL):
- Ben Saunders, Necati Cihan Camgöz, and Richard Bowden. 2022. Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production. In Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives, pages 95–102, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production (Saunders et al., SLTAT 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.sltat-1.15.pdf
- Data
- PHOENIX14T, RWTH-PHOENIX-Weather 2014 T
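
The abstract describes a skeletal graph with joints as nodes, spatial and temporal connections as edges, and a spatio-temporal adjacency matrix applied within self-attention. The sketch below illustrates that general idea only; it is not the paper's SGSA layer. It assumes one common way of injecting an adjacency matrix, namely masking the attention scores so that each joint attends only to its graph neighbours, and the paper's exact formulation may differ. The function names, the toy three-joint skeleton, and the random weights are all hypothetical.

```python
import numpy as np


def spatio_temporal_adjacency(num_frames, num_joints, bones):
    """Boolean adjacency over (frame, joint) nodes, indexed as t * num_joints + j."""
    n = num_frames * num_joints
    adj = np.eye(n, dtype=bool)  # self-loops: every node can attend to itself
    for t in range(num_frames):
        base = t * num_joints
        for a, b in bones:  # spatial edges between joints within one frame
            adj[base + a, base + b] = adj[base + b, base + a] = True
        if t + 1 < num_frames:  # temporal edges: same joint in adjacent frames
            for j in range(num_joints):
                nxt = base + num_joints + j
                adj[base + j, nxt] = adj[nxt, base + j] = True
    return adj


def graph_masked_self_attention(x, adj, w_q, w_k, w_v):
    """Single-head scaled dot-product attention restricted to graph edges.

    Assumption: the adjacency is used as a hard mask on the scores.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = np.where(adj, scores, -np.inf)  # non-neighbours get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy example: a 3-joint chain (0-1-2) over 2 frames, 4 features per joint.
rng = np.random.default_rng(0)
bones = [(0, 1), (1, 2)]
adj = spatio_temporal_adjacency(num_frames=2, num_joints=3, bones=bones)
x = rng.normal(size=(6, 4))
w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
out, weights = graph_masked_self_attention(x, adj, w_q, w_k, w_v)
```

Here each row of `weights` is a distribution over the six (frame, joint) nodes that is zero wherever `adj` has no edge, so the output for a joint mixes only features from itself, its skeletal neighbours in the same frame, and the same joint in adjacent frames.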