Multi-resolution Annotations for Emoji Prediction

Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi


Abstract
Emojis can express various linguistic components, including emotions, sentiments, and events. Predicting the proper emojis associated with text provides a way to summarize the text accurately, and it has been shown to be a useful auxiliary task for many Natural Language Understanding (NLU) tasks. Labels in existing emoji prediction datasets are all passage-level and usually follow a multi-class classification setting. However, a single emoji often cannot fully cover the theme of a piece of text. It is thus useful to infer which part of the text each emoji relates to. The lack of multi-label and aspect-level emoji prediction datasets is one of the bottlenecks for this task. This paper annotates an emoji prediction dataset with passage-level multi-class/multi-label annotations and aspect-level multi-class annotations. We also present a novel annotation method with which we generate the aspect-level annotations. The annotations are generated heuristically, taking advantage of the self-attention mechanism in Transformer networks. We validate the annotations both automatically and manually to ensure their quality. We also benchmark the dataset with a pre-trained BERT model.
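As a rough illustration of the kind of heuristic the abstract describes, the sketch below shows how self-attention weights from a pre-trained BERT model can be inspected to find the tokens most associated with a passage-level prediction. The aggregation scheme (mean over heads of the last layer, attention from the [CLS] token), the top-k cutoff, and the model name are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch: aggregate BERT self-attention over a passage and rank
# tokens by how much attention [CLS] pays to them. The top-scoring tokens
# serve as a candidate aspect span for a predicted emoji.
# NOTE: the aggregation and cutoff below are assumptions for illustration.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

text = "Finally finished my thesis, time to celebrate with friends!"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer, each of shape
# [batch, num_heads, seq_len, seq_len]; take the last layer and average heads.
last_layer = outputs.attentions[-1].mean(dim=1)[0]  # [seq_len, seq_len]
cls_attention = last_layer[0]                       # attention from [CLS]

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
scored = [
    (tok, score.item())
    for tok, score in zip(tokens, cls_attention)
    if tok not in ("[CLS]", "[SEP]")
]
# Print the five tokens [CLS] attends to most strongly.
for tok, score in sorted(scored, key=lambda x: -x[1])[:5]:
    print(f"{tok}\t{score:.3f}")
```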
Anthology ID:
2020.emnlp-main.542
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6684–6694
URL:
https://aclanthology.org/2020.emnlp-main.542
DOI:
10.18653/v1/2020.emnlp-main.542
Cite (ACL):
Weicheng Ma, Ruibo Liu, Lili Wang, and Soroush Vosoughi. 2020. Multi-resolution Annotations for Emoji Prediction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6684–6694, Online. Association for Computational Linguistics.
Cite (Informal):
Multi-resolution Annotations for Emoji Prediction (Ma et al., EMNLP 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.emnlp-main.542.pdf
Video:
https://slideslive.com/38939081