Matan Fchima
2022
JCT at SemEval-2022 Task 4-A: Patronism Detection in Posts Written in English using Preprocessing Methods and various Machine Learning Methods
Yaakov HaCohen-Kerner | Ilan Meyrowitsch | Matan Fchima
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
In this paper, we describe our submissions to SemEval-2022 subtask 4-A - “Patronizing and Condescending Language Detection: Binary Classification”. We developed different models for this subtask. We applied 11 supervised machine learning methods and 9 preprocessing methods. Our best submission was a model we built with BertForSequenceClassification. Our experiments indicate that a preprocessing stage is essential for a successful model. The dataset for Subtask 1 is highly imbalanced. The F1-scores obtained on the oversampled training dataset were higher than those obtained on the original imbalanced training dataset.
JCT at SemEval-2022 Task 6-A: Sarcasm Detection in Tweets Written in English and Arabic using Preprocessing Methods and Word N-grams
Yaakov HaCohen-Kerner | Matan Fchima | Ilan Meyrowitsch
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
In this paper, we describe our submissions to the SemEval-2022 contest. We tackled subtask 6-A - “iSarcasmEval: Intended Sarcasm Detection In English and Arabic – Binary Classification”. We developed different models for two languages: English and Arabic. We applied 4 supervised machine learning methods, 6 preprocessing methods for English and 3 for Arabic, and 3 oversampling methods. Our best submitted model for the English test dataset was an SVC model that balanced the dataset using SMOTE and removed stop words. For the Arabic test dataset, our best submitted model was an SVC model whose preprocessing removed elongation.