KoSign Sign Language Translation Project: Introducing The NIASL2021 Dataset

Mathew Huerta-Enochian, Du Hui Lee, Hye Jin Myung, Kang Suk Byun, Jun Woo Lee


Abstract
We introduce a new sign language production (SLP) and sign language translation (SLT) dataset, NIASL2021, consisting of 201,026 Korean-KSL data pairs. KSL translations of Korean source texts are represented in three formats: video recordings, keypoint position data, and time-aligned gloss annotations for each hand (using a 7,989-sign vocabulary) and for eight different non-manual signals (NMS). We evaluated our sign language elicitation methodology and found that text-based prompting had a negative effect on translation quality in terms of naturalness and comprehension. We recommend distilling text into a visual medium before translating into sign language or adding a prompt-blind review step to text-based translation methodologies.
Anthology ID:
2022.sltat-1.9
Volume:
Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
SLTAT
Publisher:
European Language Resources Association
Pages:
59–66
URL:
https://aclanthology.org/2022.sltat-1.9
Cite (ACL):
Mathew Huerta-Enochian, Du Hui Lee, Hye Jin Myung, Kang Suk Byun, and Jun Woo Lee. 2022. KoSign Sign Language Translation Project: Introducing The NIASL2021 Dataset. In Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives, pages 59–66, Marseille, France. European Language Resources Association.
Cite (Informal):
KoSign Sign Language Translation Project: Introducing The NIASL2021 Dataset (Huerta-Enochian et al., SLTAT 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.sltat-1.9.pdf
Data:
How2Sign, RWTH-PHOENIX-Weather 2014 T