Naama Rivlin-Angert
2025
The Enemy from Within: A Study of Political Delegitimization Discourse in Israeli Political Speech
Naama Rivlin-Angert | Guy Mor-Lan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
We present the first large-scale computational study of political delegitimization discourse (PDD), defined as symbolic attacks on the normative validity of political entities. We curate and manually annotate a novel Hebrew-language corpus of 10,410 sentences drawn from parliamentary speeches (1993-2023), Facebook posts, and leading news outlets (2018-2021), of which 1,812 instances (17.4%) exhibit PDD and 642 carry additional annotations for intensity, incivility, target type, and affective framing. We introduce a two-stage classification pipeline and benchmark fine-tuned encoder models and decoder LLMs. Our best model (DictaLM 2.0) attains an F1 of 0.74 for binary PDD detection and a macro-F1 of 0.67 for classification of delegitimization characteristics. Applying this classifier to longitudinal and cross-platform data, we observe a marked rise in PDD over three decades, higher prevalence on social media than in parliamentary debate, greater use by male politicians than by their female counterparts, and stronger tendencies among right-leaning actors, with pronounced spikes during election campaigns and major political events. Our findings demonstrate the feasibility and value of automated PDD analysis for studying democratic discourse.
HebID: Detecting Social Identities in Hebrew-language Political Text
Guy Mor-Lan | Naama Rivlin-Angert | Yael R. Kaplan | Tamir Sheafer | Shaul R. Shenhav
Findings of the Association for Computational Linguistics: EMNLP 2025
Political language is deeply intertwined with social identities. While social identities are often shaped by specific cultural contexts, existing NLP datasets are predominantly English-centric and focus on coarse-grained identity categories. We introduce HebID, the first multilabel Hebrew corpus for social identity detection. The corpus contains 5,536 sentences from Israeli politicians’ Facebook posts (Dec 2018-Apr 2021), with each sentence manually annotated for twelve nuanced social identities (e.g., Rightist, Ultra-Orthodox, Socially-oriented) selected based on their salience in national survey data. We benchmark multilabel and single-label encoders alongside 2B-9B-parameter decoder LLMs, finding that Hebrew-tuned LLMs provide the best results (macro-F1 = 0.74). We apply our classifier to politicians’ Facebook posts and parliamentary speeches, evaluating differences in popularity, temporal trends, clustering patterns, and gender-related variations in identity expression. We utilize identity choices from a national public survey, comparing the identities portrayed in elite discourse with those prioritized by the public. HebID provides a comprehensive foundation for studying social identities in Hebrew and can serve as a model for similar research in other non-English political contexts.