On the Usage of Semantics, Syntax, and Morphology for Noun Classification in IsiZulu
Imaan Sayed, Zola Mahlaza, Alexander van der Leek, Jonathan Mopp, C. Maria Keet
Abstract
There is limited work aimed at solving the core task of noun classification for Nguni languages. The task focuses on identifying the semantic categorisation of each noun and plays a crucial role in the ability to form semantically and morphologically valid sentences. The work by Byamugisha (2022) was the first to tackle the problem for a related, but non-Nguni, language. While there have been efforts to replicate it for a Nguni language, there has been no effort focused on comparing the technique used in the original work vs. contemporary neural methods or a number of traditional machine learning classification techniques that do not rely on human-guided knowledge to the same extent. We reproduce the work of Byamugisha (2022) with different configurations to account for differences in access to datasets and resources, and compare the approach with a pre-trained transformer-based model and with traditional machine learning models that rely on less human-guided knowledge. The newly created data-driven models outperform the knowledge-infused models, with the best performing models achieving an F1 score of 0.97.
- Anthology ID:
- 2025.resourceful-1.23
- Volume:
- Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025)
- Month:
- March
- Year:
- 2025
- Address:
- Tallinn, Estonia
- Editors:
- Špela Arhar Holdt, Nikolai Ilinykh, Barbara Scalvini, Micaella Bruton, Iben Nyholm Debess, Crina Madalina Tudor
- Venues:
- RESOURCEFUL | WS
- Publisher:
- University of Tartu Library, Estonia
- Pages:
- 96–105
- URL:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.resourceful-1.23/
- Cite (ACL):
- Imaan Sayed, Zola Mahlaza, Alexander van der Leek, Jonathan Mopp, and C. Maria Keet. 2025. On the Usage of Semantics, Syntax, and Morphology for Noun Classification in IsiZulu. In Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025), pages 96–105, Tallinn, Estonia. University of Tartu Library, Estonia.
- Cite (Informal):
- On the Usage of Semantics, Syntax, and Morphology for Noun Classification in IsiZulu (Sayed et al., RESOURCEFUL 2025)
- PDF:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.resourceful-1.23.pdf