A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks
Nastaran Babanejad, Ameeta Agrawal, Aijun An, Manos Papagelis
Abstract
Affective tasks such as sentiment analysis, emotion classification, and sarcasm detection have been popular in recent years due to an abundance of user-generated data, accurate computational linguistic models, and a broad range of relevant applications in various domains. At the same time, many studies have highlighted the importance of text preprocessing as an integral step in any natural language processing prediction model and downstream task. While preprocessing in affective systems is well-studied, preprocessing in word vector-based models applied to affective systems is not. To address this limitation, we conduct a comprehensive analysis of the role of preprocessing techniques in affective analysis based on word vector models. Our analysis is the first of its kind and provides useful insights into the importance of each preprocessing technique when applied at the training phase, commonly ignored in pretrained word vector models, and/or at the downstream task phase.
- Anthology ID:
- 2020.acl-main.514
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5799–5810
- URL:
- https://aclanthology.org/2020.acl-main.514
- DOI:
- 10.18653/v1/2020.acl-main.514
- Cite (ACL):
- Nastaran Babanejad, Ameeta Agrawal, Aijun An, and Manos Papagelis. 2020. A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5799–5810, Online. Association for Computational Linguistics.
- Cite (Informal):
- A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks (Babanejad et al., ACL 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.acl-main.514.pdf