Abstract
Tweets are short messages that often include specialized language such as hashtags and emojis. In this paper, we present a simple strategy to process emojis: replace them with their natural language descriptions and use pretrained word embeddings, as is normally done with standard words. We show that this strategy is more effective than using pretrained emoji embeddings for tweet classification. Specifically, we obtain new state-of-the-art results in irony detection and sentiment analysis even though our neural network is simpler than previous proposals.
- Anthology ID:
- N19-1214
- Volume:
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Jill Burstein, Christy Doran, Thamar Solorio
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2096–2101
- URL:
- https://aclanthology.org/N19-1214
- DOI:
- 10.18653/v1/N19-1214
- Cite (ACL):
- Abhishek Singh, Eduardo Blanco, and Wei Jin. 2019. Incorporating Emoji Descriptions Improves Tweet Classification. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2096–2101, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- Incorporating Emoji Descriptions Improves Tweet Classification (Singh et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/N19-1214.pdf
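
The emoji-handling strategy summarized in the abstract (replace each emoji with its natural language description, then treat the resulting words like any other tokens) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the Unicode character name is an acceptable stand-in for the emoji description, and the helper names `is_emoji` and `describe_emojis` are hypothetical. The emoji check is a rough heuristic: it does not cover emojis in the Basic Multilingual Plane, skin-tone modifiers, or zero-width-joiner sequences.

```python
import unicodedata


def is_emoji(ch: str) -> bool:
    """Rough heuristic (an assumption, not from the paper): treat
    'Symbol, other' characters at or above U+1F000 as emojis."""
    return unicodedata.category(ch) == "So" and ord(ch) >= 0x1F000


def describe_emojis(tweet: str) -> str:
    """Replace each detected emoji with its lowercased Unicode name,
    so the words can be looked up in ordinary pretrained word embeddings."""
    pieces = []
    for ch in tweet:
        name = unicodedata.name(ch, "") if is_emoji(ch) else ""
        if name:
            # Pad with spaces so the description does not fuse with neighbors.
            pieces.append(" " + name.lower() + " ")
        else:
            # Keep ordinary characters (and unnamed symbols) unchanged.
            pieces.append(ch)
    # Collapse the extra whitespace introduced by the padding.
    return " ".join("".join(pieces).split())


if __name__ == "__main__":
    print(describe_emojis("great day \U0001F602"))
```

After this step, a tweet contains only ordinary words, so a standard tokenizer and a single pretrained word embedding table can be used without a separate emoji embedding table.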