A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings

Aitor García Pablos, Montse Cuadros, German Rigau


Abstract
A key point in Sentiment Analysis is determining the polarity of the sentiment implied by a given word or expression. In basic Sentiment Analysis systems, the sentiment polarity of words is counted and weighted in different ways to provide a degree of positivity or negativity. Words are now also commonly modelled as continuous dense vectors, known as word embeddings, which seem to encode interesting semantic knowledge. In Sentiment Analysis, word embeddings are used as features for more complex supervised classification systems to obtain sentiment classifiers. In this paper we compare a set of existing sentiment lexicons and sentiment lexicon generation techniques. We also show a simple but effective technique to calculate a polarity value for each word in a domain using existing methods for generating continuous word embeddings. Further, we show that word embeddings calculated on an in-domain corpus capture polarity better than those calculated on a general-domain corpus.
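The abstract does not spell out how the polarity value is computed, so the following is only a minimal sketch of one common embedding-based approach: score a word by its cosine similarity to a positive seed word minus its cosine similarity to a negative seed word. The seed words, the function names, and the three-dimensional toy vectors are all assumptions made for illustration; in practice the vectors would come from an embedding model trained on an in-domain corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as carrying no similarity signal
    return dot / (norm_u * norm_v)


def polarity(word, embeddings, pos_seed="excellent", neg_seed="horrible"):
    """Hypothetical polarity score: sim(word, pos_seed) - sim(word, neg_seed).

    Positive values suggest positive polarity, negative values the opposite.
    The seed words are illustrative assumptions, not taken from the abstract.
    """
    vec = embeddings[word]
    return cosine(vec, embeddings[pos_seed]) - cosine(vec, embeddings[neg_seed])


# Toy vectors invented for this example; real ones would be learned
# from a domain corpus (for instance, restaurant reviews).
toy_embeddings = {
    "excellent": [0.9, 0.1, 0.0],
    "horrible": [-0.8, 0.2, 0.1],
    "tasty": [0.7, 0.3, 0.1],
    "rude": [-0.6, 0.1, 0.3],
}

scores = {w: polarity(w, toy_embeddings) for w in ("tasty", "rude")}
for w, s in scores.items():
    print(f"{w}: {s:+.3f}")
```

With these toy vectors, "tasty" receives a positive score and "rude" a negative one. Because the score depends only on the embedding space, retraining the embeddings on a different domain corpus would change the polarity assigned to the same word, which is the kind of in-domain versus general-domain contrast the abstract describes.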
Anthology ID:
L16-1009
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
54–60
URL:
https://aclanthology.org/L16-1009
Cite (ACL):
Aitor García Pablos, Montse Cuadros, and German Rigau. 2016. A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 54–60, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings (García Pablos et al., LREC 2016)
PDF:
https://preview.aclanthology.org/update-css-js/L16-1009.pdf