Max Spero
2026
AI use in American newspapers is widespread, uneven, and rarely disclosed
Jenna Russell | Marzena Karpinska | Destiny Akinode | James Zhou | Katherine Thai | Bradley Emi | Max Spero | Mohit Iyyer
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Jenna Russell | Marzena Karpinska | Destiny Akinode | James Zhou | Katherine Thai | Bradley Emi | Max Spero | Mohit Iyyer
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online editions of 1.5K American newspapers published in the summer of 2025. Using Pangram, a state-of-the-art AI detector, we discover that approximately 9% of newly-published articles are either partially or fully AI-generated. This AI use is unevenly distributed, appearing more frequently in smaller, local outlets, in specific topics such as weather and technology, and within certain ownership groups. We also analyze 45K opinion pieces from Washington Post, New York Times, and Wall Street Journal, finding that they are 6.4 times more likely to contain AI-generated content than news articles from the same publications, with many AI-flagged op-eds authored by prominent public figures. Despite this prevalence, we find that AI use is rarely disclosed: a manual audit of 100 AI-flagged articles found only five disclosures of AI use. A factuality analysis shows AI-generated articles are 8.2 times more likely to contain hallucinated claims than human-written news. Overall, our audit highlights the immediate need for greater transparency and updated editorial standards regarding the use of AI in journalism to maintain public trust.
2025
Pangram at GenAI Detection Task 3: An Active Learning Approach to Machine-Generated Text Detection
Bradley N. Emi | Max Spero | Elyas Masrour
Proceedings of the 1stWorkshop on GenAI Content Detection (GenAIDetect)
Bradley N. Emi | Max Spero | Elyas Masrour
Proceedings of the 1stWorkshop on GenAI Content Detection (GenAIDetect)
We pretrain an autoregressive LLM-based detector on a wide variety of datasets, domains, languages, prompt schemes, and LLMs used to generate the AI portion of the dataset. We aggressively employ several augmentation strategies and preprocessing strategies to improve robustness. We then mine the RAID train set for the AI examples with the largest error based on the original classifier, and mix those examples and their human-written counterparts back into the training set. We then retrain the detector until convergence.
DAMAGE: Detecting Adversarially Modified AI Generated Text
Elyas Masrour | Bradley N. Emi | Max Spero
Proceedings of the 1stWorkshop on GenAI Content Detection (GenAIDetect)
Elyas Masrour | Bradley N. Emi | Max Spero
Proceedings of the 1stWorkshop on GenAI Content Detection (GenAIDetect)
AI humanizers are a new class of online software tools meant to paraphrase and rewrite AI-generated text in a way that allows them to evade AI detection software. We study 19 AI humanizer and paraphrasing tools and qualitatively assess their effects and faithfulness in preserving the meaning of the original text. We show that many existing AI detectors fail to detect humanized text. Finally, we demonstrate a robust model that can detect humanized AI text while maintaining a low false positive rate using a data-centric augmentation approach. We attack our own detector, training our own fine-tuned model optimized against our detector’s predictions, and show that our detector’s cross-humanizer generalization is sufficient to remain robust to this attack.