Fatih Uzdilli


2017

A Twitter Corpus and Benchmark Resources for German Sentiment Analysis
Mark Cieliebak | Jan Milan Deriu | Dominic Egger | Fatih Uzdilli
Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media

In this paper we present SB10k, a new corpus for sentiment analysis with approximately 10,000 German tweets. We use this new corpus and two existing corpora to provide state-of-the-art benchmarks for sentiment analysis in German: we implement a CNN (based on the winning system of SemEval-2016) and a feature-based SVM, and compare their performance on all three corpora. For the CNN, we also create German word embeddings trained on 300M tweets. These word embeddings are then optimized for sentiment analysis using distant-supervised learning. The new corpus, the German word embeddings (plain and optimized), and source code to re-run the benchmarks are publicly available.

2016

SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision
Jan Deriu | Maurice Gonzenbach | Fatih Uzdilli | Aurelien Lucchi | Valeria De Luca | Martin Jaggi
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Swiss-Chocolate: Combining Flipout Regularization and Random Forests with Artificially Built Subsystems to Boost Text-Classification for Sentiment
Fatih Uzdilli | Martin Jaggi | Dominic Egger | Pascal Julmy | Leon Derczynski | Mark Cieliebak
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

JOINT_FORCES: Unite Competing Sentiment Classifiers with Random Forest
Oliver Dürr | Fatih Uzdilli | Mark Cieliebak
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Swiss-Chocolate: Sentiment Detection using Sparse SVMs and Part-Of-Speech n-Grams
Martin Jaggi | Fatih Uzdilli | Mark Cieliebak
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools
Mark Cieliebak | Oliver Dürr | Fatih Uzdilli
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper, we analyze the quality of several commercial tools for sentiment detection. All tools are tested on nearly 30,000 short texts from various sources, such as tweets, news, and reviews. The best commercial tools have an average accuracy of 60%. We then apply machine learning techniques (Random Forests) to combine all tools, and show that the resulting meta-classifier improves the overall performance significantly.