A System for Dynamically Tracking Content Moderation on Reddit

George Arthur Baker, Bharadwaj Kadiyala


Abstract
Recent work in natural language processing, human-computer interaction, and computational social science has taken an interest in decentralized content moderation, in which individual communities largely determine their own norms, rules, and enforcement. A key challenge for this body of work is that, once content is moderated, the content and related variables become difficult or impossible to recover. Previous work often relied on third-party historical data sources, but recent world events, legal disputes, and policy shifts have significantly disrupted these services, practically disabling their research use cases. As a result, to conduct new research and reproduce previous results, researchers must record content as it is created and monitor variables of interest over time. In this paper we present and publicly release a software system for the dynamic monitoring of Reddit posts, communities, and moderation actions, to enable scalable and reproducible research on decentralized platform governance and content moderation. To the authors’ knowledge, at the time of publication this system is the only available solution for general-purpose, real-time, policy-compliant longitudinal data collection on Reddit. Furthermore, the system’s integration with the official Reddit API enables the collection of authentication-gated data, such as community engagement metrics and moderation team information, that were unavailable in previous historical data sources.
Anthology ID:
2026.acl-demo.42
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Greg Durrett, Ping Jian
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
428–435
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-demo.42/
Cite (ACL):
George Arthur Baker and Bharadwaj Kadiyala. 2026. A System for Dynamically Tracking Content Moderation on Reddit. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 428–435, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
A System for Dynamically Tracking Content Moderation on Reddit (Baker & Kadiyala, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-demo.42.pdf