DuluthNLP at SemEval-2023 Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset

Samuel Akrah; Ted Pedersen

doi:10.18653/v1/2023.semeval-1.236

Abstract

This paper describes the DuluthNLP system that participated in Task 12 of SemEval-2023, AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset. Given a set of tweets, the task requires participating systems to classify each tweet as negative, positive, or neutral. We evaluate a range of monolingual and multilingual pretrained models on the Twi language dataset, one of the 14 African languages included in the SemEval task. We introduce TwiBERT, a new pretrained model trained from scratch. We show that TwiBERT, along with mBERT, generally performs best when trained on the Twi dataset, achieving an F1 score of 64.29% on the official evaluation test data, which ranks 14th out of 30 submissions for Track 10. The TwiBERT model is released at https://huggingface.co/sakrah/TwiBERT
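As an illustration of how an F1 score over the three sentiment classes can be computed, here is a minimal sketch in plain Python. The abstract does not say which averaging scheme the reported 64.29% uses, so both macro and support-weighted averages are shown as assumptions; the function names are hypothetical and are not taken from the DuluthNLP system.

```python
from collections import Counter

# The three sentiment classes named in the task description.
LABELS = ("negative", "neutral", "positive")


def per_class_f1(gold, pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(gold, pred, lab) for lab in labels) / len(labels)


def weighted_f1(gold, pred, labels=LABELS):
    """Per-class F1 averaged with weights proportional to gold-label counts."""
    support = Counter(gold)
    total = len(gold)
    return sum(
        support[lab] / total * per_class_f1(gold, pred, lab) for lab in labels
    )
```

Both functions take two equal-length lists of label strings, the gold labels and the system predictions. They differ only when the classes are imbalanced, in which case the weighted variant gives frequent classes more influence on the final score.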

Anthology ID: 2023.semeval-1.236
Volume: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 1697–1701
URL: https://aclanthology.org/2023.semeval-1.236
DOI: 10.18653/v1/2023.semeval-1.236
Cite (ACL): Samuel Akrah and Ted Pedersen. 2023. DuluthNLP at SemEval-2023 Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1697–1701, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): DuluthNLP at SemEval-2023 Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset (Akrah & Pedersen, SemEval 2023)
PDF: https://preview.aclanthology.org/ingest-bitext-workshop/2023.semeval-1.236.pdf
