Sonja Schmer-Galunder

Also published as: Sonja Schmer-galunder


2024

Recognizing Value Resonance with Resonance-Tuned RoBERTa Task Definition, Experimental Validation, and Robust Modeling
Noam K. Benkler | Scott Friedman | Sonja Schmer-Galunder | Drisana Marissa Mosaphir | Robert P. Goldman | Ruta Wheelock | Vasanth Sarathy | Pavan Kantharaju | Matthew D. McLure
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Understanding the implicit values and beliefs of diverse groups and cultures using qualitative texts – such as long-form narratives – and domain-expert interviews is a fundamental goal of social anthropology. This paper builds upon a 2022 study that introduced the NLP task of Recognizing Value Resonance (RVR) for gauging perspective – positive, negative, or neutral – on implicit values and beliefs in textual pairs. That study included a novel hand-annotated dataset, the World Values Corpus (WVC), designed to simulate the task of RVR, and a transformer-based model, Resonance-Tuned RoBERTa, designed to model the task. We extend existing work by refining the task definition and releasing the World Values Corpus (WVC) dataset. We further conduct several validation experiments designed to robustly evaluate the need for task-specific modeling, even in the world of LLMs. Finally, we present two additional Resonance-Tuned models trained over extended RVR datasets, designed to improve RVR model versatility and robustness. Our results demonstrate that, in recognizing value resonance, the Resonance-Tuned models outperform both top-performing Recognizing Textual Entailment (RTE) models and zero-shot GPT-3.5 under several different prompt structures, emphasizing their practical applicability. Our findings highlight the potential of RVR in capturing cultural values within texts and the importance of task-specific modeling.

2022

Extracting Associations of Intersectional Identities with Discourse about Institution from Nigeria
Pavan Kantharaju | Sonja Schmer-galunder
Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)

Word embedding models have been used in prior work to extract associations of intersectional identities within discourse concerning institutions of power, but that work restricted its focus to narratives of the nineteenth-century U.S. South. This paper builds on this prior work and introduces an initial study on the association of intersected identities with discourse concerning social institutions within social media from Nigeria. Specifically, we use word embedding models trained on tweets from Nigeria and extract associations of intersected social identities with institutions (e.g., domestic, culture) to provide insight into the alignment of identities with institutions. Our initial experiments indicate that identities at the intersection of gender and economic status groups have significant associations with discourse about the economic, political, and domestic institutions.

Towards a Multi-Entity Aspect-Based Sentiment Analysis for Characterizing Directed Social Regard in Online Messaging
Joan Zheng | Scott Friedman | Sonja Schmer-galunder | Ian Magnusson | Ruta Wheelock | Jeremy Gottlieb | Diana Gomez | Christopher Miller
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

Online messaging is dynamic, influential, and highly contextual, and a single post may contain contrasting sentiments towards multiple entities, such as dehumanizing one actor while empathizing with another in the same message. These complexities are important to capture for understanding the systematic abuse voiced within an online community, or for determining whether individuals are advocating for abuse, opposing abuse, or simply reporting abuse. In this work, we describe a formulation of directed social regard (DSR) as a problem of multi-entity aspect-based sentiment analysis (ME-ABSA), which models the degree of intensity of multiple sentiments that are associated with entities described by a text document. Our DSR schema is informed by Bandura’s psychosocial theory of moral disengagement and by recent work in ABSA. We present a dataset of over 2,900 posts and sentences, comprising over 24,000 entities annotated for DSR over nine psychosocial dimensions by three annotators. We present a novel transformer-based ME-ABSA model for DSR, achieving favorable preliminary results on this dataset.

From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains
Brodie Mather | Bonnie Dorr | Adam Dalton | William de Beaumont | Owen Rambow | Sonja Schmer-Galunder
Findings of the Association for Computational Linguistics: ACL 2022

We present a generalized paradigm for adaptation of propositional analysis (predicate-argument pairs) to new tasks and domains. We leverage an analogy between stances (belief-driven sentiment) and concerns (topical issues with moral dimensions/endorsements) to produce an explanatory representation. A key contribution is the combination of semi-automatic resource building for extraction of domain-dependent concern types (with 2-4 hours of human labor per domain) and an entirely automatic procedure for extraction of domain-independent moral dimensions and endorsement values. Prudent (automatic) selection of terms from propositional structures for lexical expansion (via semantic similarity) produces new moral dimension lexicons at three levels of granularity beyond a strong baseline lexicon. We develop a ground truth (GT) based on expert annotators and compare our concern detection output to GT, to yield 231% improvement in recall over baseline, with only a 10% loss in precision. F1 yields 66% improvement over baseline and 97.8% of human performance. Our lexically based approach yields large savings over approaches that employ costly human labor and model building. We provide to the community a newly expanded moral dimension/value lexicon, annotation guidelines, and GT.

2019

Relating Word Embedding Gender Biases to Gender Gaps: A Cross-Cultural Analysis
Scott Friedman | Sonja Schmer-Galunder | Anthony Chen | Jeffrey Rye
Proceedings of the First Workshop on Gender Bias in Natural Language Processing

Modern models for common NLP tasks often employ machine learning techniques and train on journalistic, social media, or other culturally derived text. These have recently been scrutinized for racial and gender biases, stemming from inherent bias in their training text. These biases are often undesirable, and recent work proposes methods to rectify them; however, these biases may shed light on actual racial or gender gaps in the culture(s) that produced the training text, thereby helping us understand cultural context through big data. This paper presents an approach for quantifying gender bias in word embeddings, and then using the resulting bias metrics to characterize statistical gender gaps in education, politics, economics, and health. We validate these metrics on 2018 Twitter data spanning 51 U.S. regions and 99 countries. We correlate state and country word embedding biases with 18 international and 5 U.S.-based statistical gender gaps, characterizing regularities and predictive strength.
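
The abstract above does not spell out how gender bias in word embeddings is quantified. As a rough illustration only, one common style of measure scores a word by how much closer it sits to a set of female terms than to a set of male terms, using cosine similarity. The sketch below assumes that formulation; the function names, the term lists, and the toy two-dimensional vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gender_bias(word, embeddings, female_terms, male_terms):
    """Assumed bias score (not the paper's metric).

    Positive values mean `word` is closer to the female terms,
    negative values mean it is closer to the male terms.
    """
    female_centroid = mean_vector([embeddings[t] for t in female_terms])
    male_centroid = mean_vector([embeddings[t] for t in male_terms])
    target = embeddings[word]
    return cosine(target, female_centroid) - cosine(target, male_centroid)


# Made-up toy embeddings, for illustration only.
TOY_EMBEDDINGS = {
    "she": [1.0, 0.0],
    "he": [0.0, 1.0],
    "nurse": [0.9, 0.1],
    "engineer": [0.1, 0.9],
}

if __name__ == "__main__":
    for word in ("nurse", "engineer"):
        score = gender_bias(word, TOY_EMBEDDINGS, ["she"], ["he"])
        print(word, round(score, 3))
```

Under this assumed setup, scores computed per region from region-specific embeddings could then be correlated with published gender-gap statistics, which is the kind of analysis the abstract describes.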