Ladislav Lenc


Czech Text Document Corpus v 2.0
Pavel Král | Ladislav Lenc
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

UWB at SemEval-2018 Task 1: Emotion Intensity Detection in Tweets
Pavel Přibáň | Tomáš Hercig | Ladislav Lenc
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes our system created for the SemEval-2018 Task 1: Affect in Tweets (AIT-2018). We participated in both the regression and the ordinal classification subtasks for emotion intensity detection in English, Arabic, and Spanish. For the regression subtask we use the AffectiveTweets system with added features based on various word embeddings, lexicons, and LDA. For the ordinal classification subtask we additionally use our Brainy system with features based on parse trees, POS tags, and morphological features. The most beneficial features, apart from word and character n-grams, include word embeddings, POS count, and morphological features.


The Impact of Figurative Language on Sentiment Analysis
Tomáš Hercig | Ladislav Lenc
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Figurative language such as irony, sarcasm, and metaphor is considered a significant challenge in sentiment analysis. These figurative devices can sculpt the affect of an utterance and test the limits of sentiment analysis of supposedly literal texts. We explore the effect of figurative language on sentiment analysis. We incorporate figurative language indicators into the sentiment analysis process and compare the results with and without this additional information. We evaluate on the SemEval-2015 Task 11 data. In terms of mean squared error, our convolutional neural network model with additional training data outperforms the first-place team; in terms of cosine similarity, it follows closely behind first place.
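
The two evaluation measures named in the abstract above, mean squared error and cosine similarity, can be illustrated with a minimal sketch. The function names and the example scores are hypothetical and chosen only for illustration; this is not the official task scorer or the authors' code.

```python
import math


def mean_squared_error(gold, pred):
    """Average squared difference between gold and predicted scores (lower is better)."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold)


def cosine_similarity(gold, pred):
    """Cosine of the angle between the gold and predicted score vectors (higher is better)."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    dot = sum(g * p for g, p in zip(gold, pred))
    norm_gold = math.sqrt(sum(g * g for g in gold))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    if norm_gold == 0 or norm_pred == 0:
        # Convention assumed here: similarity with an all-zero vector is 0.
        return 0.0
    return dot / (norm_gold * norm_pred)


if __name__ == "__main__":
    # Invented sentiment scores for four texts, for illustration only.
    gold = [-3.0, 1.0, 4.0, -1.0]
    pred = [-2.0, 0.5, 3.0, -2.0]
    print("MSE:", mean_squared_error(gold, pred))
    print("Cosine:", cosine_similarity(gold, pred))
```

The two measures can rank systems differently: mean squared error penalizes the magnitude of each error, while cosine similarity only reflects how well the predicted scores align in direction with the gold scores.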

Word Embeddings for Multi-label Document Classification
Ladislav Lenc | Pavel Král
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

In this paper, we analyze and evaluate word embeddings for the representation of longer texts in the multi-label classification scenario. The embeddings are used in three convolutional neural network topologies. The experiments are conducted on the standard Czech ČTK and English Reuters-21578 corpora. We compare static and trainable word2vec embeddings with randomly initialized word vectors. We conclude that initialization does not play an important role in classification, whereas learning the word vectors is crucial for obtaining good results.


UWB at SemEval-2016 Task 7: Novel Method for Automatic Sentiment Intensity Determination
Ladislav Lenc | Pavel Král | Václav Rajtmajer
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)