EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis

Jasy Suet Yan Liew, Howard R. Turtle, Elizabeth D. Liddy


Abstract
This paper describes EmoTweet-28, a carefully curated corpus of 15,553 tweets annotated with 28 emotion categories for the purpose of training and evaluating machine learning models for emotion classification. EmoTweet-28 is, to date, the largest tweet corpus annotated with fine-grained emotion categories. The corpus contains annotations for four facets of emotion: valence, arousal, emotion category and emotion cues. We first used small-scale content analysis to inductively identify a set of emotion categories that characterize the emotions expressed in microblog text. We then expanded the size of the corpus using crowdsourcing. The corpus encompasses a variety of examples including explicit and implicit expressions of emotions as well as tweets containing multiple emotions. EmoTweet-28 represents an important resource to advance the development and evaluation of more emotion-sensitive systems.
Anthology ID: L16-1183
Volume: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month: May
Year: 2016
Address: Portorož, Slovenia
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 1149–1156
URL: https://aclanthology.org/L16-1183
Cite (ACL): Jasy Suet Yan Liew, Howard R. Turtle, and Elizabeth D. Liddy. 2016. EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1149–1156, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal): EmoTweet-28: A Fine-Grained Emotion Corpus for Sentiment Analysis (Liew et al., LREC 2016)
PDF: https://preview.aclanthology.org/ingestion-script-update/L16-1183.pdf