Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection

Sahrish Khan; Arshad Jhumka; Gabriele Pergola

Abstract

The detection of sexism in online content remains an open problem, as harmful language disproportionately affects women and marginalized groups. While automated systems for sexism detection have been developed, they still face two key challenges: data sparsity and the nuanced nature of sexist language. Even in large, well-curated datasets like the Explainable Detection of Online Sexism (EDOS), severe class imbalance hinders model generalization. Additionally, the overlapping and ambiguous boundaries of fine-grained categories introduce substantial annotator disagreement, reflecting the difficulty of interpreting nuanced expressions of sexism. To address these challenges, we propose two prompt-based data augmentation techniques: Definition-based Data Augmentation (DDA), which leverages category-specific definitions to generate semantically-aligned synthetic examples, and Contextual Semantic Expansion (CSE), which targets systematic model errors by enriching examples with task-specific semantic features. To further improve reliability in fine-grained classification, we introduce an ensemble strategy that resolves prediction ties by aggregating complementary perspectives from multiple language models. Our experimental evaluation on the EDOS dataset demonstrates state-of-the-art performance across all tasks, with notable improvements of macro F1 by 1.5 points for binary classification (Task A) and 4.1 points for fine-grained classification (Task C).
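To illustrate why macro F1 is the natural metric to report under the class imbalance mentioned above, here is a minimal sketch of the standard macro-averaged F1 computation. It is not taken from the paper: the function name `macro_f1`, the label strings, and the toy data are assumptions made purely for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 9 majority-class posts and 1 minority-class post.
y_true = ["not sexist"] * 9 + ["sexist"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["not sexist"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data the always-majority classifier scores high on accuracy but is heavily penalized by macro F1, because every class counts equally regardless of its size. That property is what makes the metric sensitive to performance on rare fine-grained categories.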

Anthology ID: 2025.acl-long.809
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 16553–16571
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.809/
Cite (ACL): Sahrish Khan, Arshad Jhumka, and Gabriele Pergola. 2025. Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16553–16571, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection (Khan et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.809.pdf
