Assessing Emoji Use in Modern Text Processing Tools

Abu Awal Md Shoeb, Gerard de Melo


Abstract
Emojis have become ubiquitous in digital communication, due to their visual appeal as well as their ability to vividly convey human emotion, among other factors. Their prevalence has increased the need for systems and tools that operate on text containing emojis. In this study, we assess how well current tools support such text: using test sets of tweets with emojis, we perform a series of experiments investigating the ability of prominent NLP and text processing tools to process them adequately. In particular, we consider tokenization, part-of-speech tagging, dependency parsing, and sentiment analysis. Our findings show that many systems still have notable shortcomings when operating on text containing emojis.
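To make the tokenization concern concrete, the following is a minimal illustrative sketch, not the authors' test setup or any specific tool from the study. It assumes a naive regex tokenizer (the hypothetical `naive_tokenize`) and contrasts it with a hypothetical `emoji_aware_tokenize` that re-attaches skin-tone modifiers, variation selectors, and zero-width-joiner sequences to the preceding emoji. The sample text and both function names are assumptions made for illustration.

```python
import re

ZWJ = "\u200d"   # zero-width joiner, glues emojis into one sequence
VS16 = "\ufe0f"  # variation selector-16 (emoji presentation)
# Fitzpatrick skin-tone modifiers U+1F3FB..U+1F3FF
SKIN_TONES = {chr(cp) for cp in range(0x1F3FB, 0x1F400)}

# Naive rule: runs of word characters, or any single non-space symbol.
NAIVE = re.compile(r"\w+|[^\w\s]")


def naive_tokenize(text):
    """Split text with the naive rule; multi-code-point emojis get broken up."""
    return NAIVE.findall(text)


def emoji_aware_tokenize(text):
    """Merge modifier, selector, and ZWJ pieces back onto the previous token."""
    tokens = []
    join_next = False  # True right after a ZWJ: the next piece belongs to it
    for piece in naive_tokenize(text):
        attach = join_next or piece in SKIN_TONES or piece in (ZWJ, VS16)
        if tokens and attach:
            tokens[-1] += piece
        else:
            tokens.append(piece)
        join_next = piece == ZWJ
    return tokens


# Hypothetical sample: thumbs up with a skin tone, and a ZWJ family sequence.
thumbs_up = "\U0001F44D\U0001F3FD"
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
text = f"great job {thumbs_up} {family}"

print(len(naive_tokenize(text)), "tokens from the naive tokenizer")
print(len(emoji_aware_tokenize(text)), "tokens from the emoji-aware tokenizer")
```

The naive rule splits each emoji sequence into its individual code points, whereas the merged version keeps each user-perceived emoji as one token. A production system would more likely rely on full grapheme-cluster segmentation than on this hand-written merge.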
Anthology ID:
2021.acl-long.110
Original:
2021.acl-long.110v1
Version 2:
2021.acl-long.110v2
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
1379–1388
URL:
https://aclanthology.org/2021.acl-long.110
DOI:
10.18653/v1/2021.acl-long.110
Cite (ACL):
Abu Awal Md Shoeb and Gerard de Melo. 2021. Assessing Emoji Use in Modern Text Processing Tools. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1379–1388, Online. Association for Computational Linguistics.
Cite (Informal):
Assessing Emoji Use in Modern Text Processing Tools (Shoeb & de Melo, ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.110.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.110.mp4