Rebecca Wilm
2022
Biographically Relevant Tweets – a New Dataset, Linguistic Analysis and Classification Experiments
Michael Wiegand | Rebecca Wilm | Katja Markert
Proceedings of the 29th International Conference on Computational Linguistics
We present a new dataset comprising tweets for the novel task of detecting biographically relevant utterances. Biographically relevant utterances are all those utterances that reveal some persistent and non-trivial information about the author of a tweet, e.g. habits, (dis)likes, family status, physical appearance, employment information, health issues, etc. Unlike previous research, we do not restrict biographical relevance to a small fixed set of pre-defined relations. In addition to classification experiments employing state-of-the-art classifiers to establish strong baselines for future work, we carry out a linguistic analysis that compares the predictiveness of various high-level features. We also show that the task is different from established tasks, such as aspectual classification or sentiment analysis.
2018
Distinguishing affixoid formations from compounds
Josef Ruppenhofer | Michael Wiegand | Rebecca Wilm | Katja Markert
Proceedings of the 27th International Conference on Computational Linguistics
We study German affixoids, a type of morpheme in between affixes and free stems. Several properties have been associated with them – increased productivity; a bleached semantics, which is often evaluative and/or intensifying and thus of relevance to sentiment analysis; and the existence of a free morpheme counterpart – but have not been validated empirically. In experiments on a new data set that we make available, we put these key assumptions from the morphological literature to the test and show that, despite the fact that affixoids generate many low-frequency formations, we can classify these as affixoid or non-affixoid instances with a best F1-score of 74%.