Personalization up to a Point: Why Personalized Content Moderation Needs Boundaries, and How We Can Enforce Them
Emanuele Moscato, Tiancheng Hu, Matthias Orlikowski, Paul Röttger, Debora Nozza
Abstract
Personalized content moderation can protect users from harm while facilitating free expression by tailoring moderation decisions to individual preferences rather than enforcing universal rules. However, content moderation that is fully personalized to individual preferences, no matter what those preferences are, may lead to even the most hazardous types of content being propagated on social media. In this paper, we explore this risk using hate speech as a case study. Certain types of hate speech are illegal in many countries. We show that, while fully personalized hate speech detection models increase overall user welfare (as measured by user-level classification performance), they also make predictions that violate such legal hate speech boundaries, especially when tailored to users who tolerate highly hateful content. To address this problem, we enforce legal boundaries in personalized hate speech detection by overriding predictions from personalized models with those from a boundary classifier. This approach significantly reduces legal violations while minimally affecting overall user welfare. Our findings highlight both the promise and the risks of personalized moderation, and offer a practical solution to balance user preferences with legal and ethical obligations.
- Anthology ID:
- 2025.emnlp-main.1726
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 34003–34017
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1726/
- Cite (ACL):
- Emanuele Moscato, Tiancheng Hu, Matthias Orlikowski, Paul Röttger, and Debora Nozza. 2025. Personalization up to a Point: Why Personalized Content Moderation Needs Boundaries, and How We Can Enforce Them. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 34003–34017, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Personalization up to a Point: Why Personalized Content Moderation Needs Boundaries, and How We Can Enforce Them (Moscato et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1726.pdf
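The abstract's core mechanism, overriding a personalized model's prediction with a boundary classifier's decision, can be sketched as follows. This is a minimal illustration of the override logic only, assuming binary remove/keep decisions; the function and variable names are hypothetical and not taken from the authors' implementation.

```python
from typing import Callable

def moderate(
    text: str,
    personalized_predict: Callable[[str], str],
    boundary_predict: Callable[[str], bool],
) -> str:
    """Return 'remove' or 'keep' for a piece of content.

    personalized_predict: user-specific model returning 'remove' or 'keep'.
    boundary_predict: returns True if the content crosses a legal boundary
    (e.g. illegal hate speech); its verdict overrides user preference.
    """
    if boundary_predict(text):
        # Legal boundary takes precedence over the personalized decision.
        return "remove"
    # Otherwise, defer to the user's personalized model.
    return personalized_predict(text)


# Toy usage: a maximally permissive user whose personalized model keeps
# everything, paired with a stand-in boundary check (a real system would
# use a trained classifier here).
permissive_user = lambda text: "keep"
boundary = lambda text: "slur" in text

print(moderate("an ordinary post", permissive_user, boundary))   # keep
print(moderate("a post with a slur", permissive_user, boundary)) # remove
```

The design choice mirrors the paper's finding: personalization is applied only inside the region the boundary classifier deems legal, so legal violations are curbed while most user-specific decisions are left untouched.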