Lyndon Nixon
2017
Character-based Neural Embeddings for Tweet Clustering
Svitlana Vakulenko | Lyndon Nixon | Mihai Lupu
Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media
In this paper, we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations caused by vocabulary explosion in word-based models and allows for seamless processing of multilingual content. Our evaluation results and code are available online: https://github.com/vendi12/tweet2vec_clustering.