Learning Sentiment Lexicons in Spanish

Verónica Pérez-Rosas, Carmen Banea, Rada Mihalcea


Abstract
In this paper we present a framework to derive sentiment lexicons in a target language by using manually or automatically annotated data available in a language rich in electronic resources, such as English. We show that bridging the language gap using the multilingual sense-level aligned WordNet structure allows us to generate a high-accuracy (90%) polarity lexicon comprising 1,347 entries, and a disjoint, lower-accuracy (74%) lexicon encompassing 2,496 words. By using an LSA-based vectorial expansion for the generated lexicons, we are able to obtain an average F-measure of 66% in the target language. This implies that the lexicons could be used to bootstrap higher-coverage lexicons using in-language resources.
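The sense-level bridging idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `project_lexicon`, the sense IDs, the toy word lists, and the majority-vote rule for resolving conflicting labels are all assumptions made for illustration. The sketch only shows how polarity labels attached to shared sense IDs could be carried over to target-language words.

```python
from collections import Counter


def project_lexicon(sense_polarity, target_senses):
    """Project polarity labels onto target-language words via shared sense IDs.

    sense_polarity: dict mapping a sense ID to a polarity label
                    (hypothetical labels derived from a source-language resource).
    target_senses:  dict mapping the same sense IDs to target-language words
                    (a stand-in for a sense-aligned multilingual WordNet).
    """
    votes = {}
    for sense_id, label in sense_polarity.items():
        for word in target_senses.get(sense_id, ()):
            votes.setdefault(word, Counter())[label] += 1

    lexicon = {}
    for word, counts in votes.items():
        ranked = counts.most_common(2)
        # Assumption: words whose top two labels are tied are left out
        # rather than assigned an arbitrary polarity.
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue
        lexicon[word] = ranked[0][0]
    return lexicon


# Toy data, invented for illustration only.
sense_polarity = {"s1": "positive", "s2": "negative", "s3": "positive", "s4": "negative"}
target_senses = {"s1": ["bueno"], "s2": ["horrible"], "s3": ["fino"], "s4": ["fino"]}

toy_lexicon = project_lexicon(sense_polarity, target_senses)
print(toy_lexicon)
```

In this toy run, `fino` receives one positive and one negative vote and is therefore dropped, which hints at why a projected lexicon may trade coverage for accuracy.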
Anthology ID:
L12-1645
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3077–3081
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/1081_Paper.pdf
Cite (ACL):
Verónica Pérez-Rosas, Carmen Banea, and Rada Mihalcea. 2012. Learning Sentiment Lexicons in Spanish. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3077–3081, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Learning Sentiment Lexicons in Spanish (Pérez-Rosas et al., LREC 2012)
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/1081_Paper.pdf