Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection

Xiao Pu, Zepeng Cheng, Lin Yuan, Yu Wu, Xiuli Bi


Abstract
As large language models (LLMs) generate text that increasingly resembles human writing, the subtle cues that distinguish AI-generated content from human-written content become harder to capture. Reliance on generator-specific artifacts is inherently unstable, since new models emerge rapidly and reduce the robustness of such shortcuts. This makes generalization to unseen generators a central and challenging problem for AI-text detection. To tackle this challenge, we propose a progressively structured framework that disentangles AI-detection semantics from generator-aware artifacts. This is achieved through a compact latent encoding that encourages semantic minimality, followed by perturbation-based regularization to reduce residual entanglement, and finally a discriminative adaptation stage that aligns representations with task objectives. Experiments on the MAGE benchmark, covering 20 representative LLMs across 7 categories, demonstrate consistent improvements over state-of-the-art methods, with gains of up to 24.2% in accuracy and 26.2% in F1. Notably, performance continues to improve as the diversity of training generators increases, confirming strong scalability and generalization in open-set scenarios. Our source code will be publicly available at https://github.com/PuXiao06/DRGD.
Anthology ID:
2026.acl-long.120
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2586–2598
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.120/
Cite (ACL):
Xiao Pu, Zepeng Cheng, Lin Yuan, Yu Wu, and Xiuli Bi. 2026. Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2586–2598, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection (Pu et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.120.pdf
Checklist:
 2026.acl-long.120.checklist.pdf