Abstract
This paper describes the system we designed for our participation in SemEval-2023 Task 12, Track 6, on sentiment analysis of the Algerian dialect. We propose a transformer language model approach combined with a lexicon of terms and emojis, which is used in a post-processing filtering stage. The Algerian sentiment lexicon was extracted manually from tweets. We report on our experiments on the Algerian dialect, in which we compare the performance of MARBERT with that of ArabicBERT and CAMeLBERT on the training and development datasets of Task 12. We also analyse the contribution of our post-processing lexical filtering to sentiment analysis. Our system obtained an F1 score of 70%, ranking 9th among 30 participants.
- Anthology ID:
- 2023.semeval-1.52
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 389–396
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2023.semeval-1.52/
- DOI:
- 10.18653/v1/2023.semeval-1.52
- Cite (ACL):
- Faiza Belbachir. 2023. Foul at SemEval-2023 Task 12: MARBERT Language model and lexical filtering for sentiments analysis of tweets in Algerian Arabic. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 389–396, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Foul at SemEval-2023 Task 12: MARBERT Language model and lexical filtering for sentiments analysis of tweets in Algerian Arabic (Belbachir, SemEval 2023)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2023.semeval-1.52.pdf