Christoph Schommer


2020

pdf
Component Analysis of Adjectives in Luxembourgish for Detecting Sentiments
Joshgun Sirajzade | Daniela Gierschek | Christoph Schommer
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

The aim of this paper is to investigate the role of Luxembourgish adjectives in expressing sentiments in user comments written at the web presence of rtl.lu (RTL is the abbreviation for Radio Television Letzebuerg). Alongside many textual features or representations,adjectives could be used in order to detect sentiment, even on a sentence or comment level. In fact, they are also by themselves one of the best ways to describe a sentiment, despite the fact that other word classes such as nouns, verbs, adverbs or conjunctions can also be utilized for this purpose. The empirical part of this study focuses on a list of adjectives that were extracted from an annotated corpus. The corpus contains the part of speech tags of individual words and sentiment annotation on the adjective, sentence and comment level. Suffixes of Luxembourgish adjectives like -esch, -eg, -lech, -al, -el, -iv, -ent, -los, -barand the prefixon- were explicitly investigated, especially by paying attention to their role in regards to building a model by applying classical machine learning techniques. We also considered the interaction of adjectives with other grammatical means, especially other part of speeches, e.g. negations, which can completely reverse the meaning, thus the sentiment of an utterance.

pdf
An Annotation Framework for Luxembourgish Sentiment Analysis
Joshgun Sirajzade | Daniela Gierschek | Christoph Schommer
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

The aim of this paper is to present a framework developed for crowdsourcing sentiment annotation for the low-resource language Luxembourgish. Our tool is easily accessible through a web interface and facilitates sentence-level annotation of several annotators in parallel. In the heart of our framework is an XML database, which serves as central part linking several components. The corpus in the database consists of news articles and user comments. One of the components is LuNa, a tool for linguistic preprocessing of the data set. It tokenizes the text, splits it into sentences and assigns POS-tags to the tokens. After that, the preprocessed text is stored in XML format into the database. The Sentiment Annotation Tool, which is a browser-based tool, then enables the annotation of split sentences from the database. The Sentiment Engine, a separate module, is trained with this material in order to annotate the whole data set and analyze the sentiment of the comments over time and in relationship to the news articles. The gained knowledge can again be used to improve the sentiment classification on the one hand and on the other hand to understand the sentiment phenomenon from the linguistic point of view.

2019

pdf
A Personalized Sentiment Model with Textual and Contextual Information
Siwen Guo | Sviatlana Höhn | Christoph Schommer
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

In this paper, we look beyond the traditional population-level sentiment modeling and consider the individuality in a person’s expressions by discovering both textual and contextual information. In particular, we construct a hierarchical neural network that leverages valuable information from a person’s past expressions, and offer a better understanding of the sentiment from the expresser’s perspective. Additionally, we investigate how a person’s sentiment changes over time so that recent incidents or opinions may have more effect on the person’s current sentiment than the old ones. Psychological studies have also shown that individual variation exists in how easily people change their sentiments. In order to model such traits, we develop a modified attention mechanism with Hawkes process applied on top of a recurrent network for a user-specific design. Implemented with automatically labeled Twitter data, the proposed model has shown positive results employing different input formulations for representing the concerned information.