Abstract
In this paper, we describe our submissions to the SemEval-2022 contest. We tackled subtask 6-A, “iSarcasmEval: Intended Sarcasm Detection In English and Arabic – Binary Classification”. We developed different models for the two languages, English and Arabic. We applied 4 supervised machine learning methods, 6 preprocessing methods for English and 3 for Arabic, and 3 oversampling methods. Our best submitted model for the English test dataset was an SVC model that balanced the dataset using SMOTE and removed stop words. For the Arabic test dataset, our best submitted model was an SVC model whose preprocessing removed elongation.
- Anthology ID:
- 2022.semeval-1.145
- Volume:
- Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1031–1038
- URL:
- https://aclanthology.org/2022.semeval-1.145
- DOI:
- 10.18653/v1/2022.semeval-1.145
- Cite (ACL):
- Yaakov HaCohen-Kerner, Matan Fchima, and Ilan Meyrowitsch. 2022. JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1031–1038, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams (HaCohen-Kerner et al., SemEval 2022)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2022.semeval-1.145.pdf