Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects

Hala Mulki; Hatem Haddad; Mourad Gridach; Ismail Babaoğlu

doi:10.18653/v1/W19-4604

Abstract

Arabic sentiment analysis models have employed compositional embedding features to represent Arabic dialectal content. These embeddings are usually composed via ordered, syntax-aware composition functions and learned within deep neural frameworks. Given the free word order and the varying syntax across Arabic dialects, a sentiment analysis system developed for one dialect might not be effective for the others. Here we present syntax-ignorant n-gram embeddings for sentiment analysis of several Arabic dialects. The proposed embeddings were composed and learned using an unordered composition function and a shallow neural model. Five datasets of different dialects were used to evaluate the produced embeddings in the sentiment analysis task. The results showed that our syntax-ignorant embeddings could outperform word2vec and both doc2vec variants, as well as hand-crafted system baselines, while performing competitively against baseline systems that adopted more complicated neural architectures.
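To illustrate what an unordered composition function means here, the following is a minimal sketch and not the authors' implementation. It assumes the composition is an element-wise sum over n-gram vectors, which the abstract does not specify. The helper names (`extract_ngrams`, `make_table`, `compose`), the embedding dimension, and the random vectors are all hypothetical; in the described approach the vectors would be learned by the shallow neural model.

```python
import random


def extract_ngrams(tokens, n_max=2):
    """Return every 1..n_max-gram of a token list as a tuple."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def make_table(ngram_list, dim=4, seed=0):
    """Hypothetical stand-in for learned embeddings: one random vector per n-gram."""
    rng = random.Random(seed)
    return {g: [rng.uniform(-1, 1) for _ in range(dim)]
            for g in sorted(set(ngram_list))}


def compose(ngram_list, table, dim=4):
    """Assumed unordered composition: element-wise sum of n-gram vectors."""
    total = [0.0] * dim
    for g in ngram_list:
        for k, value in enumerate(table[g]):
            total[k] += value
    return total


# Placeholder tokens; real input would be a tokenized dialectal sentence.
tokens = ["w1", "w2", "w3"]
grams = extract_ngrams(tokens)
table = make_table(grams)
sentence_vec = compose(grams, table)

# Feeding the same n-grams in a different order gives the same vector
# (up to floating-point rounding), because addition is commutative.
shuffled = grams[:]
random.Random(1).shuffle(shuffled)
order_invariant = all(abs(a - b) < 1e-9
                      for a, b in zip(sentence_vec, compose(shuffled, table)))
```

Because the sum ignores the order in which n-grams arrive, the resulting sentence vector carries no information about their sequence. An ordered, syntax-aware composition would instead produce a different vector for each ordering.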

Anthology ID: W19-4604
Volume: Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month: August
Year: 2019
Address: Florence, Italy
Editors: Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue: WANLP
Publisher: Association for Computational Linguistics
Pages: 30–39
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/W19-4604/
DOI: 10.18653/v1/W19-4604
Cite (ACL): Hala Mulki, Hatem Haddad, Mourad Gridach, and Ismail Babaoğlu. 2019. Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 30–39, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects (Mulki et al., WANLP 2019)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/W19-4604.pdf
Data: ASTD, TSAC
