UMUTeam at SemEval-2023 Task 12: Ensemble Learning of LLMs applied to Sentiment Analysis for Low-resource African Languages
José Antonio García-Díaz, Camilo Caparros-laiz, Ángela Almela, Gema Alcaráz-Mármol, María José Marín-Pérez, Rafael Valencia-García
Abstract
These working notes summarize the participation of the UMUTeam in the SemEval-2023 shared task AfriSenti, which focuses on Sentiment Analysis in several African languages. Two subtasks are proposed: one in which each language is considered separately, and another in which all languages are merged. Our proposal for both subtasks is based on combining features extracted from several multilingual Large Language Models with a subset of language-independent linguistic features. Our best results are achieved with the African languages less represented in the training set: Xitsonga, a Mozambique dialect, with a weighted F1-score of 54.89%; Algerian Arabic, with a weighted F1-score of 68.52%; Swahili, with a weighted F1-score of 60.52%; and Twi, with a weighted F1-score of 71.14%.
- Anthology ID:
- 2023.semeval-1.38
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 285–292
- URL:
- https://preview.aclanthology.org/add_missing_videos/2023.semeval-1.38/
- DOI:
- 10.18653/v1/2023.semeval-1.38
- Cite (ACL):
- José Antonio García-Díaz, Camilo Caparros-laiz, Ángela Almela, Gema Alcaráz-Mármol, María José Marín-Pérez, and Rafael Valencia-García. 2023. UMUTeam at SemEval-2023 Task 12: Ensemble Learning of LLMs applied to Sentiment Analysis for Low-resource African Languages. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 285–292, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- UMUTeam at SemEval-2023 Task 12: Ensemble Learning of LLMs applied to Sentiment Analysis for Low-resource African Languages (García-Díaz et al., SemEval 2023)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2023.semeval-1.38.pdf