Veronika Solopova


2021

pdf
A German Corpus of Reflective Sentences
Veronika Solopova | Oana-Iuliana Popescu | Margarita Chikobava | Ralf Romeike | Tim Landgraf | Christoph Benzmüller
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Reflection about a learning process is beneficial to students in higher education (Bub-nys, 2019). The importance of machine understanding of reflective texts grows as applications supporting students become more widespread. Nevertheless, due to the sensitive content, there is no public corpus available yet for the classification of text reflectiveness. We provide the first open-access corpus of reflective student essays in German. We collected essays from three different disciplines (Software Development, Ethics of Artificial Intelligence, and Teacher Training). We annotated the corpus at sentence level with binary reflective/non-reflective labels, using an iterative annotation process with linguistic and didactic specialists, mapping the reflective components found in the data to existing schemes and complementing them. We propose and evaluate linguistic features of reflectiveness and analyse their distribution within the resulted sentences according to their labels. Our contribution constitutes the first open-access corpus to help the community towards a unified approach for reflection detection.

2020

pdf
Adapting Coreference Resolution to Twitter Conversations
Berfin Aktaş | Veronika Solopova | Annalena Kohnert | Manfred Stede
Findings of the Association for Computational Linguistics: EMNLP 2020

The performance of standard coreference resolution is known to drop significantly on Twitter texts. We improve the performance of the (Lee et al., 2018) system, which is originally trained on OntoNotes, by retraining on manually-annotated Twitter conversation data. Further experiments by combining different portions of OntoNotes with Twitter data show that selecting text genres for the training data can beat the mere maximization of training data amount. In addition, we inspect several phenomena such as the role of deictic pronouns in conversational data, and present additional results for variant settings. Our best configuration improves the performance of the”out of the box” system by 21.6%.