When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection
Lang Gao, Xuhui Li, Chenxi Wang, Mingzhe Li, Wei Liu, Zirui Song, Jinghui Zhang, Rui Yan, Preslav Nakov, Xiuying Chen
Abstract
As large language models (LLMs) increasingly imitate personal writing styles, personalization has become a key challenge for machine-generated text (MGT) detection. Yet personalized MGT detection remains largely underexplored. In this work, we introduce StyloBench, the first benchmark for evaluating detector robustness under personalization, built from literary and blog texts paired with their LLM-generated imitations. Experiments across diverse detectors show pronounced performance instability under personalization, with frequent inversions relative to general-domain behavior. To better understand this limitation, we conduct an in-depth analysis and attribute it to a feature-inversion trap, i.e., features that are effective for separating human-written text (HWT) from MGT in general flip their effect in personalized contexts, ultimately misleading detectors. Motivated by this, we propose StyloCheck, a diagnostic framework for predicting detector robustness under personalization. StyloCheck identifies the inverted features and quantifies detector dependence using perturbed texts pronounced in the features. In our experiments, StyloCheck predicts both the direction and magnitude of cross-domain performance shifts with an 85% correlation to actual outcomes. We hope this work will raise awareness of the structural risks introduced by personalization and motivate more robust approaches to personalized MGT detection. The code is available at: https://github.com/mbzuai-nlp/Personalized_MGT_Detect- Anthology ID:
- 2026.acl-long.1998
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 43143–43171
- Language:
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1998/
- DOI:
- Cite (ACL):
- Lang Gao, Xuhui Li, Chenxi Wang, Mingzhe Li, Wei Liu, Zirui Song, Jinghui Zhang, Rui Yan, Preslav Nakov, and Xiuying Chen. 2026. When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 43143–43171, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection (Gao et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.1998.pdf