Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects

Hala Mulki, Hatem Haddad, Mourad Gridach, Ismail Babaoğlu

Abstract
Arabic sentiment analysis models have employed compositional embedding features to represent Arabic dialectal content. These embeddings are usually composed via ordered, syntax-aware composition functions and learned within deep neural frameworks. Given the free word order and the varying syntax across the different Arabic dialects, a sentiment analysis system developed for one dialect might not be effective for the others. Here we present syntax-ignorant n-gram embeddings to be used in sentiment analysis of several Arabic dialects. The proposed embeddings were composed and learned using an unordered composition function and a shallow neural model. Five datasets of different dialects were used to evaluate the produced embeddings in the sentiment analysis task. The results revealed that our syntax-ignorant embeddings could outperform the word2vec model and both doc2vec variants, as well as hand-crafted baseline systems, while performing competitively against baseline systems that adopted more complicated neural architectures.
Anthology ID:
W19-4604
Volume:
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue:
WANLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
30–39
Language:
URL:
https://aclanthology.org/W19-4604
DOI:
10.18653/v1/W19-4604
Bibkey:
Cite (ACL):
Hala Mulki, Hatem Haddad, Mourad Gridach, and Ismail Babaoğlu. 2019. Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 30–39, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects (Mulki et al., WANLP 2019)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/W19-4604.pdf
Data
ASTD, TSAC