Javier Fernandez-Cruz
2020
Design and Evaluation of SentiEcon: a fine-grained Economic/Financial Sentiment Lexicon from a Corpus of Business News
Antonio Moreno-Ortiz | Javier Fernandez-Cruz | Chantal Pérez-Hernández
Proceedings of the Twelfth Language Resources and Evaluation Conference
In this paper we present, describe, and evaluate SentiEcon, a large, comprehensive, domain-specific computational lexicon designed for sentiment analysis applications, built from our own corpus of online business news. SentiEcon was created as a plug-in lexicon for the sentiment analysis tool Lingmotif; it therefore follows Lingmotif's data structure requirements and presupposes the availability of a general-language core sentiment lexicon that covers non-specific sentiment-carrying terms and phrases. It contains 6,470 entries, both single-word and multi-word expressions, each tagged for semantic orientation and intensity. We evaluate SentiEcon’s performance by comparing results in a sentence classification task that uses exclusively sentiment words as features. The sentence dataset was extracted from business news texts and included certain keywords known to recurrently convey strong semantic orientation, such as “debt”, “inflation” or “markets”. The results show that performance is significantly improved when adding SentiEcon to the general-language sentiment lexicon.
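As a rough illustration of the plug-in idea described above, the following minimal Python sketch overlays a small domain lexicon on a general-language core lexicon and derives sentence features from the matched sentiment expressions only. Everything in it is an assumption made for illustration: the entries, the `(orientation, intensity)` tuple format, the rule that plug-in entries override core entries, the longest-match strategy for multi-word expressions, and the simple score comparison used in place of a trained classifier. It does not reproduce Lingmotif's data format or the evaluation setup of the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical entry format: (orientation, intensity).
# These entries are invented for illustration, not taken from SentiEcon.
Entry = Tuple[str, int]

CORE_LEXICON: Dict[str, Entry] = {
    "good": ("pos", 1),
    "crisis": ("neg", 2),
}

DOMAIN_LEXICON: Dict[str, Entry] = {
    "debt": ("neg", 1),
    "inflation": ("neg", 1),
    "bull market": ("pos", 2),
    "profit warning": ("neg", 3),
}


def merge_lexicons(core: Dict[str, Entry], plugin: Dict[str, Entry]) -> Dict[str, Entry]:
    """Overlay the plug-in on the core lexicon (assumed: plug-in entries win)."""
    merged = dict(core)
    merged.update(plugin)
    return merged


def match_entries(sentence: str, lexicon: Dict[str, Entry],
                  max_len: int = 3) -> List[Tuple[str, Entry]]:
    """Greedy longest-match lookup of single and multi-word expressions."""
    tokens = [t.strip('.,;:!?"()').lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    matches: List[Tuple[str, Entry]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "profit warning" beats "profit".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                matches.append((candidate, lexicon[candidate]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return matches


def sentiment_features(sentence: str, lexicon: Dict[str, Entry]) -> Dict[str, int]:
    """Sum intensities per orientation, using only matched sentiment items."""
    features = {"pos": 0, "neg": 0}
    for _, (orientation, intensity) in match_entries(sentence, lexicon):
        features[orientation] += intensity
    return features


def classify(sentence: str, lexicon: Dict[str, Entry]) -> str:
    """Toy decision rule standing in for a trained sentence classifier."""
    f = sentiment_features(sentence, lexicon)
    if f["pos"] > f["neg"]:
        return "positive"
    if f["neg"] > f["pos"]:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sentence = "The profit warning came despite a good quarter."
    combined = merge_lexicons(CORE_LEXICON, DOMAIN_LEXICON)
    print("core only:", classify(sentence, CORE_LEXICON))
    print("core + domain:", classify(sentence, combined))
```

With these toy entries, the core lexicon alone only sees "good" in the example sentence, while the combined lexicon also matches the multi-word expression "profit warning", which changes the predicted label. This mirrors, in miniature, the comparison between the general-language lexicon alone and the general-language lexicon plus the domain lexicon.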