Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu
Derwin Ngomane, Rooweither Mabuya, Jade Abbott, Vukosi Marivate
Abstract
In this study, we investigate the effectiveness of cross-lingual word embeddings for zero-shot transfer learning between a high-resource language, English, and a low-resource language, isiZulu. IsiZulu belongs to the South African Nguni language family, which is characterised by complex agglutinating morphology. We use VecMap, an open-source tool, to obtain cross-lingual word embeddings. As an extrinsic evaluation of the embeddings, we train a news classifier on labelled English data and use it to categorise unlabelled isiZulu data through zero-shot transfer learning. Our model achieves a weighted average F1-score of 0.34. Our findings demonstrate that VecMap generates modular word embeddings in the cross-lingual space that have an impact on the downstream classifier used for zero-shot transfer learning.
- Anthology ID:
- 2023.rail-1.2
- Volume:
- Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023)
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Rooweither Mabuya, Don Mthobela, Mmasibidi Setaka, Menno Van Zaanen
- Venue:
- RAIL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11–17
- URL:
- https://aclanthology.org/2023.rail-1.2
- DOI:
- 10.18653/v1/2023.rail-1.2
- Cite (ACL):
- Derwin Ngomane, Rooweither Mabuya, Jade Abbott, and Vukosi Marivate. 2023. Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu. In Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023), pages 11–17, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu (Ngomane et al., RAIL 2023)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2023.rail-1.2.pdf
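
The abstract reports a weighted average F1-score for the zero-shot news classifier. As a rough illustration of what that metric measures, the sketch below computes per-class F1 and averages it weighted by each class's share of the true labels. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the paper; the authors' actual evaluation code may differ.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); guard against an empty denominator.
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical toy labels for illustration only (not data from the paper).
y_true = ["sport", "sport", "politics", "business"]
y_pred = ["sport", "politics", "politics", "politics"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because each class contributes in proportion to its frequency, a classifier that does well only on the most common news category can still obtain a moderate weighted score, which is worth keeping in mind when reading a single summary figure.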