Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation

Euan McGill, Luis Chiruzzo, Santiago Egea Gómez, Horacio Saggion


Abstract
This paper examines the use of manually part-of-speech-tagged sign language gloss data in the Text2Gloss and Gloss2Text translation tasks, and runs an LSTM-based sequence labelling model on the same glosses for automatic part-of-speech tagging. We find that combining tag-enhanced glosses with pretraining of the neural model improves performance in the translation tasks. The results of the tagging task are limited, but provide a methodological framework for further research into tagging sign language gloss data.
Anthology ID:
2023.resourceful-1.10
Volume:
Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023)
Month:
May
Year:
2023
Address:
Tórshavn, the Faroe Islands
Editors:
Nikolai Ilinykh, Felix Morger, Dana Dannélls, Simon Dobnik, Beáta Megyesi, Joakim Nivre
Venue:
RESOURCEFUL
Publisher:
Association for Computational Linguistics
Pages:
70–76
URL:
https://aclanthology.org/2023.resourceful-1.10
Cite (ACL):
Euan McGill, Luis Chiruzzo, Santiago Egea Gómez, and Horacio Saggion. 2023. Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation. In Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), pages 70–76, Tórshavn, the Faroe Islands. Association for Computational Linguistics.
Cite (Informal):
Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation (McGill et al., RESOURCEFUL 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.resourceful-1.10.pdf