Abstract
In this paper, we describe our submissions to the SemEval-2023 contest. We tackled Task 12, “AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset”. We developed separate models for 12 African languages and a 13th model for a multilingual dataset built from these 12 languages. We applied a wide variety of word and character n-gram features weighted by their tf-idf values, 4 classical machine learning methods, 2 deep learning methods, and 3 oversampling methods. We also used 12 sentiment lexicons and applied extensive hyperparameter tuning.
- Anthology ID:
- 2023.semeval-1.49
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 365–378
- URL:
- https://aclanthology.org/2023.semeval-1.49
- DOI:
- 10.18653/v1/2023.semeval-1.49
- Cite (ACL):
- Ron Keinan and Yaakov Hacohen-Kerner. 2023. JCT at SemEval-2023 Tasks 12 A and 12B: Sentiment Analysis for Tweets Written in Low-resource African Languages using Various Machine Learning and Deep Learning Methods, Resampling, and HyperParameter Tuning. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 365–378, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- JCT at SemEval-2023 Tasks 12 A and 12B: Sentiment Analysis for Tweets Written in Low-resource African Languages using Various Machine Learning and Deep Learning Methods, Resampling, and HyperParameter Tuning (Keinan & Hacohen-Kerner, SemEval 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2023.semeval-1.49.pdf
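
The feature pipeline summarized in the abstract (word and character n-grams weighted by tf-idf, followed by oversampling) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the n-gram ranges, the smoothed idf formula, the choice of random oversampling (the abstract does not name its 3 oversampling methods), and the toy English tweets are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return lowercase word n-grams of length n_min..n_max (ranges are assumed)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n_min=2, n_max=3):
    """Return character n-grams of length n_min..n_max (ranges are assumed)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf(docs_features):
    """Map each document's feature list to a {feature: tf-idf weight} dict.

    Uses a smoothed idf, log((1 + N) / (1 + df)) + 1, which is one common
    variant and an assumption here.
    """
    n_docs = len(docs_features)
    df = Counter()
    for feats in docs_features:
        df.update(set(feats))  # count each feature once per document
    vectors = []
    for feats in docs_features:
        tf = Counter(feats)
        total = len(feats) or 1
        vectors.append({
            f: (c / total) * (math.log((1 + n_docs) / (1 + df[f])) + 1.0)
            for f, c in tf.items()
        })
    return vectors


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    by_label = {}
    for s, y in zip(samples, labels):
        by_label.setdefault(y, []).append(s)
    target = max(len(items) for items in by_label.values())
    out_samples, out_labels = [], []
    for y, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for s in items + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical toy data standing in for labeled tweets.
tweets = ["good good day", "bad day", "very good", "so so"]
labels = ["positive", "negative", "positive", "neutral"]

# Prefix features so a word n-gram and a character n-gram never collide.
features = [["w:" + g for g in word_ngrams(t)] + ["c:" + g for g in char_ngrams(t)]
            for t in tweets]
vectors = tfidf(features)
X, y = random_oversample(vectors, labels)
```

After oversampling, each sentiment class has as many samples as the largest class, and the resulting sparse vectors could be passed to any classifier that accepts feature dictionaries.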