Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation
Euan McGill, Luis Chiruzzo, Santiago Egea Gómez, Horacio Saggion
Abstract
This paper examines the use of manually part-of-speech tagged sign language gloss data in the Text2Gloss and Gloss2Text translation tasks, as well as running an LSTM-based sequence labelling model on the same glosses for automatic part-of-speech tagging. We find that a combination of tag-enhanced glosses and pretraining the neural model positively impacts performance in the translation tasks. The results of the tagging task are limited, but provide a methodological framework for further research into tagging sign language gloss data.
- Anthology ID:
- 2023.resourceful-1.10
- Volume:
- Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023)
- Month:
- May
- Year:
- 2023
- Address:
- Tórshavn, the Faroe Islands
- Venue:
- RESOURCEFUL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 70–76
- URL:
- https://aclanthology.org/2023.resourceful-1.10
- Cite (ACL):
- Euan McGill, Luis Chiruzzo, Santiago Egea Gómez, and Horacio Saggion. 2023. Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation. In Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), pages 70–76, Tórshavn, the Faroe Islands. Association for Computational Linguistics.
- Cite (Informal):
- Part-of-Speech tagging Spanish Sign Language data and its applications in Sign Language machine translation (McGill et al., RESOURCEFUL 2023)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2023.resourceful-1.10.pdf