This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
BesmaBenaziz
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
In this paper, we analyze the impact of the weighted concatenation of TF-IDF features for the Arabic Dialect Identification task while we participated in the NADI2021 shared task. This study is performed for two subtasks: subtask 1.1 (country-level MSA) and subtask 1.2 (country-level DA) identification. The classifiers supporting our comparative study are Linear Support Vector Classification (LSVC), Linear Regression (LR), Perceptron, Stochastic Gradient Descent (SGD), Passive Aggressive (PA), Complement Naive Bayes (CNB), MutliLayer Perceptron (MLP), and RidgeClassifier. In the evaluation phase, our system gives F1 scores of 14.87% and 21.49%, for country-level MSA and DA identification respectively, which is very close to the average F1 scores achieved by the submitted systems and recorded for both subtasks (18.70% and 24.23%).
This paper describes our approach to detecting Sentiment and Sarcasm for Arabic in the ArSarcasm 2021 shared task. Data preprocessing is a crucial task for a successful learning, that is why we applied a set of preprocessing steps to the dataset before training two classifiers, namely Linear Support Vector Classifier (LSVC) and Bidirectional Long Short Term Memory (BiLSTM). The findings show that despite the simplicity of the proposed approach, using the LSVC model with a normalizing Arabic (NA) preprocessing and the BiLSTM architecture with an Embedding layer as input have yielded an encouraging F1score of 33.71% and 57.80% for sarcasm and sentiment detection, respectively.