Abstract
Low-dimensional vector representations of social media users can benefit applications like recommendation systems and user attribute inference. Recent work has shown that user embeddings can be improved by combining different types of information, such as text and network data. We propose a data augmentation method that allows novel feature types to be used within off-the-shelf embedding models. Experimenting with the task of friend recommendation on a dataset of 5,019 Twitter users, we show that our approach can lead to substantial performance gains with the simple addition of network and geographic features.
- Anthology ID:
- W17-4406
- Volume:
- Proceedings of the 3rd Workshop on Noisy User-generated Text
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Editors:
- Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
- Venue:
- WNUT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 45–49
- URL:
- https://aclanthology.org/W17-4406
- DOI:
- 10.18653/v1/W17-4406
- Cite (ACL):
- Linzi Xing and Michael J. Paul. 2017. Incorporating Metadata into Content-Based User Embeddings. In Proceedings of the 3rd Workshop on Noisy User-generated Text, pages 45–49, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Incorporating Metadata into Content-Based User Embeddings (Xing & Paul, WNUT 2017)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/W17-4406.pdf