Abstract
Many social media classification tasks analyze the content of a message, but do not consider the context of the message. For example, in tweet stance classification – where a tweet is categorized according to a viewpoint it espouses – the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pretraining method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. In a tweet gun control stance classification dataset, improvements from pre-training are only apparent when training data is limited.

- Anthology ID:
- W18-6124
- Volume:
- Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
- Month:
- November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
- Venue:
- WNUT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 184–194
- URL:
- https://aclanthology.org/W18-6124
- DOI:
- 10.18653/v1/W18-6124
- Cite (ACL):
- Adrian Benton and Mark Dredze. 2018. Using Author Embeddings to Improve Tweet Stance Classification. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text, pages 184–194, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Using Author Embeddings to Improve Tweet Stance Classification (Benton & Dredze, WNUT 2018)
- PDF:
- https://preview.aclanthology.org/fix-volume-bibkeys/W18-6124.pdf