Annika Marie Schoene


2024

MEANT: Multimodal Encoder for Antecedent Information
Benjamin Irving | Annika Marie Schoene
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The stock market provides a rich well of information that is split across modalities, making it an ideal candidate for multimodal evaluation. Multimodal data plays an increasingly important role in the development of machine learning and has been shown to positively impact performance. But information can do more than exist across modes; it can exist across time. How should we attend to temporal data that consists of multiple information types? This work introduces (i) the MEANT model, a Multimodal Encoder for Antecedent information, and (ii) a new dataset called TempStock, which consists of price, Tweet, and graphical data, with over a million Tweets from all of the companies in the S&P 500 Index. We find that MEANT improves performance over existing baselines by more than 15%, and our ablation study shows that textual information affects performance on our time-dependent task far more than visual information does. The code and dataset will be made available upon publication.

Is it safe to machine translate suicide-related language from English to Galician?
John E. Ortega | Annika Marie Schoene
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1

All Models are Wrong, But Some are Deadly: Inconsistencies in Emotion Detection in Suicide-related Tweets
Annika Marie Schoene | Resmi Ramachandranpillai | Tomo Lazovich | Ricardo A. Baeza-Yates
Proceedings of the Third Workshop on NLP for Positive Impact

Recent work in psychology has shown that people who experience mental health challenges are more likely to express their thoughts, emotions, and feelings on social media than to share them with a clinical professional. Distinguishing suicide-related content, such as suicide mentioned in a humorous context, from genuine expressions of suicidal ideation is essential to better understand context and risk. In this paper, we provide a first analysis of the differences between emotion labels annotated by humans and labels predicted by three fine-tuned language models (LMs) for suicide-related content. We find that (i) there is little agreement between LMs and humans on emotion labels for suicide-related Tweets and (ii) individual LMs predict similar emotion labels across all suicide-related categories. Our findings lead us to question the credibility and usefulness of such methods in high-risk scenarios such as suicidal ideation detection.

2022

RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification
Annika Marie Schoene | Nina Dethlefs | Sophia Ananiadou
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Several existing resources are available for sentiment analysis (SA) tasks and are used for learning sentiment-specific embedding (SSE) representations. These resources are either large common-sense knowledge graphs (KGs) that cover only a limited number of polarities/emotions, or smaller resources (e.g., lexicons) that cover fine-grained emotions but require costly human annotation. Therefore, using knowledge resources to learn SSE representations is limited either by low coverage of polarities/emotions or by the overall size of a resource. In this paper, we introduce a new directed KG called ‘RELATE’, built to overcome both the issue of low emotion coverage and the issue of scalability. RELATE is the first KG of its size to cover Ekman’s six basic emotions directed towards entities. It is based on linguistic rules to incorporate the benefits of semantics without relying on costly human annotation. The performance of ‘RELATE’ is evaluated by learning SSE representations with a Graph Convolutional Neural Network (GCN).

2016

Automatic Identification of Suicide Notes from Linguistic and Sentiment Features
Annika Marie Schoene | Nina Dethlefs
Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities