LLM-based Adversarial Dataset Augmentation for Automatic Media Bias Detection

Martin Wessel


Abstract
This study presents BiasAdapt, a novel data augmentation strategy designed to enhance the robustness of automatic media bias detection models. Leveraging the BABE dataset, BiasAdapt uses a generative language model to identify bias-indicative keywords and replace them with alternatives from opposing categories, thus creating adversarial examples that preserve the original bias labels. The contributions of this work are twofold: it proposes a scalable method for augmenting bias datasets with adversarial examples while preserving labels, and it publicly releases an augmented adversarial media bias dataset. Training on BiasAdapt reduces the reliance on spurious cues in four of the six evaluated media bias categories.
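The keyword-replacement idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: BiasAdapt uses a generative language model to identify the keywords, whereas the sketch below substitutes a small fixed lexicon for that step. The category names, the word lists, and the function name `augment` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical lexicon: each category maps to keywords a classifier might
# wrongly treat as bias cues. ASSUMPTION: in the paper a generative language
# model identifies such keywords; a fixed dictionary stands in for it here.
KEYWORDS = {
    "left": ["Democrats", "liberals"],
    "right": ["Republicans", "conservatives"],
}
# Hypothetical mapping from each category to its opposing category.
OPPOSING = {"left": "right", "right": "left"}


def augment(sentence, label):
    """Swap category keywords for opposing-category alternatives.

    Returns the rewritten sentence together with the unchanged label,
    so the adversarial example keeps its original bias annotation.
    """
    # Build a lookup from each lowercased keyword to its replacement.
    swaps = {}
    for category, words in KEYWORDS.items():
        alternatives = KEYWORDS[OPPOSING[category]]
        for i, word in enumerate(words):
            swaps[word.lower()] = alternatives[i % len(alternatives)]

    # A single regex pass avoids swapping a replaced word back again.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in swaps) + r")\b",
        re.IGNORECASE,
    )
    new_sentence = pattern.sub(lambda m: swaps[m.group(0).lower()], sentence)
    return new_sentence, label


if __name__ == "__main__":
    print(augment("Democrats slammed the reckless bill.", 1))
```

In this sketch the label is passed through untouched, which mirrors the abstract's claim that the adversarial examples preserve the original bias labels: only the category keyword changes, not the wording that makes the sentence biased or unbiased.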
Anthology ID:
2025.latechclfl-1.3
Volume:
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Anna Kazantseva, Stan Szpakowicz, Stefania Degaetano-Ortlieb, Yuri Bizzoni, Janis Pagel
Venues:
LaTeCHCLfL | WS
Publisher:
Association for Computational Linguistics
Pages:
19–24
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.latechclfl-1.3/
Cite (ACL):
Martin Wessel. 2025. LLM-based Adversarial Dataset Augmentation for Automatic Media Bias Detection. In Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), pages 19–24, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
LLM-based Adversarial Dataset Augmentation for Automatic Media Bias Detection (Wessel, LaTeCHCLfL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.latechclfl-1.3.pdf