Abstract
This study investigates the robustness and generalization of transformer-based models for automatic media bias detection. We explore the behavior of current bias classifiers by analyzing feature attributions and stress-testing with adversarial datasets. The findings reveal a disproportionate focus on rare but strongly connotated words, suggesting a rather superficial understanding of linguistic bias and challenges in contextual interpretation. This problem is further highlighted by inconsistent bias assessment when stress-tested with different entities and minorities. Enhancing automatic media bias detection models is critical to improving inclusivity in media, ensuring balanced and fair representation of diverse perspectives.

- Anthology ID:
- 2024.ltedi-1.3
- Volume:
- Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian's, Malta
- Editors:
- Bharathi Raja Chakravarthi, Bharathi B, Paul Buitelaar, Thenmozhi Durairaj, György Kovács, Miguel Ángel García Cumbreras
- Venues:
- LTEDI | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 21–30
- URL:
- https://aclanthology.org/2024.ltedi-1.3
- Cite (ACL):
- Martin Wessel and Tomáš Horych. 2024. Beyond the Surface: Spurious Cues in Automatic Media Bias Detection. In Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 21–30, St. Julian's, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Beyond the Surface: Spurious Cues in Automatic Media Bias Detection (Wessel & Horych, LTEDI-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.ltedi-1.3.pdf