MASH: Evading Black-Box AI-Generated Text Detectors via Style Humanization

Yongtong Gu; Songze Li; Xia Hu


Abstract

The increasing misuse of AI-generated texts (AIGT) has motivated the rapid development of AIGT detection methods. However, the reliability of these detectors remains fragile against adversarial evasions. Existing attack strategies often rely on white-box assumptions or demand prohibitively high computational and interaction costs, rendering them ineffective under practical black-box scenarios. In this paper, we propose Multi-stage Alignment for Style Humanization (MASH), a novel framework that evades black-box detectors based on style transfer. MASH sequentially employs style-injection supervised fine-tuning, direct preference optimization, and inference-time refinement to shape the distributions of AI-generated texts to resemble those of human-written texts. Experiments across 6 datasets and 5 detectors demonstrate the superior performance of MASH over 11 baseline evaders. Specifically, MASH achieves an average Attack Success Rate (ASR) of 92%, surpassing the strongest baselines by an average of 24%, while maintaining superior linguistic quality.
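The Attack Success Rate quoted above can be illustrated with a minimal sketch. It assumes the common definition (the fraction of attacked AI-generated texts that a detector labels as human-written); the abstract does not give the paper's exact formula, so this may differ from it. The names `attack_success_rate` and `toy_detector`, and the 0.5 threshold, are hypothetical and used only for illustration.

```python
from typing import Callable, Sequence


def attack_success_rate(
    texts: Sequence[str],
    detector: Callable[[str], float],
    threshold: float = 0.5,
) -> float:
    """Return the fraction of texts scored below `threshold` by `detector`.

    Assumption: `detector` returns the probability that a text is
    AI-generated, and a score below `threshold` counts as an evasion.
    """
    if not texts:
        raise ValueError("texts must be non-empty")
    evaded = sum(1 for text in texts if detector(text) < threshold)
    return evaded / len(texts)


def toy_detector(text: str) -> float:
    """Hypothetical stand-in for a black-box detector (not a real one).

    Flags any text containing the word 'delve' as likely AI-generated.
    """
    return 0.9 if "delve" in text.lower() else 0.1


samples = [
    "We delve into the topic.",
    "honestly idk, it rained all day",
    "Let us delve deeper.",
    "my cat knocked the mug off again",
]
asr = attack_success_rate(samples, toy_detector)
print(f"ASR = {asr:.2f}")
```

In a real evaluation the detector would be queried as a black box, and the texts would be the evader's outputs for inputs that were originally detected as AI-generated.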

Anthology ID: 2026.findings-acl.1487
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 29749–29769
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1487/
Cite (ACL): Yongtong Gu, Songze Li, and Xia Hu. 2026. MASH: Evading Black-Box AI-Generated Text Detectors via Style Humanization. In Findings of the Association for Computational Linguistics: ACL 2026, pages 29749–29769, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): MASH: Evading Black-Box AI-Generated Text Detectors via Style Humanization (Gu et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1487.pdf
Checklist: 2026.findings-acl.1487.checklist.pdf
