This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
AliOmrani
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
In subjective NLP tasks, where a single ground truth does not exist, the inclusion of diverse annotators becomes crucial as their unique perspectives significantly influence the annotations. In realistic scenarios, the annotation budget often becomes the main determinant of the number of perspectives (i.e., annotators) included in the data and subsequent modeling. We introduce a novel framework for annotation collection and modeling in subjective tasks that aims to minimize the annotation budget while maximizing the predictive performance for each annotator. Our framework has a two-stage design: first, we rely on a small set of annotators to build a multitask model, and second, we augment the model for a new perspective by strategically annotating a few samples per annotator. To test our framework at scale, we introduce and release a unique dataset, Moral Foundations Subjective Corpus, of 2000 Reddit posts annotated by 24 annotators for moral sentiment. We demonstrate that our framework surpasses the previous SOTA in capturing the annotators’ individual perspectives with as little as 25% of the original annotation budget on two datasets. Furthermore, our framework results in more equitable models, reducing the performance disparity among annotators.
Language usage is related to speaker age, gender, moral concerns, political ideology, and other attributes. Current state-of-the-art methods for predicting these attributes take a speaker’s utterances as input and provide a prediction per speaker attribute. Most of these approaches struggle to handle a large number of utterances per speaker. This difficulty is primarily due to the computational constraints of the models. Additionally, only a subset of speaker utterances may be relevant to specific attributes. In this paper, we formulate speaker attribute prediction as a Multiple Instance Learning (MIL) problem and propose RL-MIL, a novel approach based on Reinforcement Learning (RL) that effectively addresses both of these challenges. Our experiments demonstrate that our RL-based methodology consistently outperforms previous approaches across a range of related tasks: predicting speakers’ psychographics and demographics from social media posts, and political ideologies from transcribed speeches. We create synthetic datasets and investigate the behavior of RL-MIL systematically. Our results show the success of RL-MIL in improving speaker attribute prediction by learning to select relevant speaker utterances.
Detecting problematic content, such as hate speech, is a multifaceted and ever-changing task, influenced by social dynamics, user populations, diversity of sources, and evolving language. There has been significant efforts, both in academia and in industry, to develop annotated resources that capture various aspects of problematic content. Due to researchers’ diverse objectives, these annotations are often inconsistent and hence, reports of progress on the detection of problematic content are fragmented. This pattern is expected to persist unless we pool these resources, taking into account the dynamic nature of this issue. In this paper, we propose integrating the available resources, leveraging their dynamic nature to break this pattern, and introduce a continual learning framework and benchmark for problematic content detection. Our benchmark, comprising 84 related tasks, creates a novel measure of progress: prioritizing the adaptability of classifiers to evolving tasks over excelling in specific tasks. To ensure continuous relevance, our benchmark is designed for seamless integration of new tasks. Our results demonstrate that continual learning methods outperform static approaches by up to 17% and 4% AUC in capturing the evolving content and adapting to novel forms of problematic content
Existing bias mitigation methods require social-group-specific word pairs (e.g., “man” – “woman”) for each social attribute (e.g., gender), restricting the bias mitigation to only one specified social attribute. Further, this constraint renders such methods impractical and costly for mitigating bias in understudied and/or unmarked social groups. We propose that the Stereotype Content Model (SCM) — a theoretical framework developed in social psychology for understanding the content of stereotyping — can help debiasing efforts to become social-group-agnostic by capturing the underlying connection between bias and stereotypes. SCM proposes that the content of stereotypes map to two psychological dimensions of warmth and competence. Using only pairs of terms for these two dimensions (e.g., warmth: “genuine” – “fake”; competence: “smart” – “stupid”), we perform debiasing with established methods on both pre-trained word embeddings and large language models. We demonstrate that our social-group-agnostic, SCM-based debiasing technique performs comparably to group-specific debiasing on multiple bias benchmarks, but has theoretical and practical advantages over existing approaches.
Bias mitigation approaches reduce models’ dependence on sensitive features of data, such as social group tokens (SGTs), resulting in equal predictions across the sensitive features. In hate speech detection, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. Here, to take the specific language about each SGT into account, we rely on counterfactual fairness and equalize predictions among counterfactuals, generated by changing the SGTs. Our method evaluates the similarity in sentence likelihoods (via pre-trained language models) among counterfactuals, to treat SGTs equally only within interchangeable contexts. By applying logit pairing to equalize outcomes on the restricted set of counterfactuals for each instance, we improve fairness metrics while preserving model performance on hate speech detection.