A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks

Nastaran Babanejad, Ameeta Agrawal, Aijun An, Manos Papagelis


Abstract
Affective tasks such as sentiment analysis, emotion classification, and sarcasm detection have been popular in recent years due to an abundance of user-generated data, accurate computational linguistic models, and a broad range of relevant applications in various domains. At the same time, many studies have highlighted the importance of text preprocessing as an integral step in any natural language processing prediction model and downstream task. While preprocessing in affective systems is well studied, preprocessing in word vector-based models applied to affective systems is not. To address this limitation, we conduct a comprehensive analysis of the role of preprocessing techniques in affective analysis based on word vector models. Our analysis is the first of its kind and provides useful insights into the importance of each preprocessing technique when applied at the training phase, which is commonly ignored in pretrained word vector models, and/or at the downstream task phase.
Anthology ID:
2020.acl-main.514
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5799–5810
URL:
https://aclanthology.org/2020.acl-main.514
DOI:
10.18653/v1/2020.acl-main.514
Cite (ACL):
Nastaran Babanejad, Ameeta Agrawal, Aijun An, and Manos Papagelis. 2020. A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5799–5810, Online. Association for Computational Linguistics.
Cite (Informal):
A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks (Babanejad et al., ACL 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.acl-main.514.pdf
Video:
http://slideslive.com/38929338