Farzan Karimi-Malekabadi
Also published as: Farzan Karimi Malekabadi
2026
The Moral Foundations Reddit Corpus
Jackson P. Trager | Alireza S. Ziabari | Elnaz Rahmati | Aida Mostafazadeh Davani | Preni Golazizian | Farzan Karimi-Malekabadi | Ali Omrani | Zhihe Li | Brendan Kennedy | Georgios Chochlakis | Nils Karl Reimer | Melissa Reyes | Kesley Cheng | Mellow Wei | Christina Merrifield | Arta Khosravi | Evans Alvarez | Morteza Dehghani
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Moral framing and sentiment can affect a variety of online and offline behaviors, including donation, environmental action, political engagement, and protest. Various computational methods in Natural Language Processing (NLP) have been used to detect moral sentiment from textual data, but achieving strong performance on such subjective tasks requires large, hand-annotated datasets. Previous corpora annotated for moral sentiment have proven valuable and have generated new insights both within NLP and across the social sciences, but have been limited to Twitter. To advance our understanding of the role of moral rhetoric, we present the Moral Foundations Reddit Corpus, a collection of 16,123 English Reddit comments curated from 12 distinct subreddits, each hand-annotated by at least three trained annotators for 8 categories of moral sentiment (i.e., Care, Proportionality, Equality, Purity, Authority, Loyalty, Thin Morality, Implicit/Explicit Morality) based on the updated Moral Foundations Theory (MFT) framework. We evaluate baselines using large language models (Llama3-8B, Ministral-8B) in zero-shot, few-shot, and PEFT (Parameter-Efficient Fine-Tuning) settings, comparing their performance to fine-tuned encoder-only models such as BERT (Bidirectional Encoder Representations from Transformers). The results show that LLMs continue to lag behind fine-tuned encoders on this subjective task, underscoring the ongoing need for human-annotated moral corpora for AI alignment evaluation.
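For readers unfamiliar with the PEFT setting the abstract mentions, the sketch below shows one common way to set up a parameter-efficient (LoRA) multi-label baseline with Hugging Face transformers and peft. The model name, label list, and LoRA hyperparameters here are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a PEFT (LoRA) baseline for multi-label moral-sentiment
# classification; label set and hyperparameters are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

# Hypothetical label set loosely following the corpus's moral categories.
LABELS = ["Care", "Proportionality", "Equality", "Purity",
          "Authority", "Loyalty", "Thin Morality", "Non-Moral"]

model_name = "meta-llama/Meta-Llama-3-8B"  # one of the LLM baselines named above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # a comment can express several foundations
)

# Wrap the base model with low-rank adapters so only a small fraction of weights train.
lora = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.1)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# One toy training step on a single comment.
batch = tokenizer(["We have a duty to protect the vulnerable."],
                  return_tensors="pt", truncation=True)
labels = torch.zeros(1, len(LABELS))
labels[0, LABELS.index("Care")] = 1.0  # multi-hot target, BCE loss under the hood
loss = model(**batch, labels=labels).loss
loss.backward()
```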
2025
MFTCXplain: A Multilingual Benchmark Dataset for Evaluating the Moral Reasoning of LLMs through Multi-hop Hate Speech Explanation
Jackson Trager | Francielle Vargas | Diego Alves | Matteo Guida | Mikel K. Ngueajio | Ameeta Agrawal | Yalda Daryani | Farzan Karimi Malekabadi | Flor Miriam Plaza-del-Arco
Findings of the Association for Computational Linguistics: EMNLP 2025
Ensuring the moral reasoning capabilities of Large Language Models (LLMs) is a growing concern as these systems are used in socially sensitive tasks. Nevertheless, current evaluation benchmarks present two major shortcomings: a lack of annotations that justify moral classifications, which limits transparency and interpretability; and a predominant focus on English, which constrains the assessment of moral reasoning across diverse cultural settings. In this paper, we introduce MFTCXplain, a multilingual benchmark dataset for evaluating the moral reasoning of LLMs via multi-hop hate speech explanations using Moral Foundations Theory. MFTCXplain comprises 3,000 tweets across Portuguese, Italian, Persian, and English, annotated with binary hate speech labels, moral categories, and text span-level rationales. Our results show a misalignment between LLM outputs and human annotations in moral reasoning tasks. While LLMs perform well in hate speech detection (F1 up to 0.836), their ability to predict moral sentiments is notably weak (F1 < 0.35). Furthermore, rationale alignment remains limited, particularly in underrepresented languages. Our findings show the limited capacity of current LLMs to internalize and reflect human moral reasoning.
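To make the "rationale alignment" notion concrete, the sketch below scores the overlap between a human-annotated rationale span and a model-highlighted span as token-level F1, one common way such alignment is measured. The span format, helper functions, and example spans are illustrative assumptions, not the benchmark's actual evaluation code.

```python
# Minimal sketch of span-level rationale alignment as token-level F1.
# Rationales are assumed to be (start, end) character spans over the text.

def span_to_token_mask(text, spans):
    """Mark each whitespace token as 1 if it overlaps any rationale span."""
    mask, pos = [], 0
    for tok in text.split():
        start = text.index(tok, pos)
        end = start + len(tok)
        pos = end
        mask.append(int(any(s < end and e > start for s, e in spans)))
    return mask

def token_f1(gold_mask, pred_mask):
    """F1 between two binary token masks of equal length."""
    tp = sum(g and p for g, p in zip(gold_mask, pred_mask))
    if tp == 0:
        return 0.0
    precision = tp / sum(pred_mask)
    recall = tp / sum(gold_mask)
    return 2 * precision * recall / (precision + recall)

text = "they deserve punishment for betraying us"
gold = span_to_token_mask(text, [(13, 40)])  # human rationale (hypothetical)
pred = span_to_token_mask(text, [(28, 40)])  # model rationale (hypothetical)
print(token_f1(gold, pred))  # 0.667: partial overlap with the human span
```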
Co-authors
- Ameeta Agrawal 1
- Evans Alvarez 1
- Diego Alves 1
- Kesley Cheng 1
- Georgios Chochlakis 1
- Yalda Daryani 1
- Aida Mostafazadeh Davani 1
- Morteza Dehghani 1
- Preni Golazizian 1
- Matteo Guida 1
- Nils Karl Reimer 1
- Brendan Kennedy 1
- Arta Khosravi 1
- Zhihe Li 1
- Christina Merrifield 1
- Mikel K. Ngueajio 1
- Ali Omrani 1
- Flor Miriam Plaza-del-Arco 1
- Elnaz Rahmati 1
- Melissa Reyes 1
- Alireza S. Ziabari 1
- Jackson Trager 1
- Jackson P. Trager 1
- Francielle Vargas 1
- Mellow Wei 1